A human gene termed APC is disclosed. Methods and kits are provided
for assessing mutations of the APC gene in human tissues and body samples.

INHERITED AND SOMATIC MUTATIONS OF
APC GENE IN COLORECTAL CANCER OF HUMANS

The U.S. Government has a paid-up license in this invention and
the right in limited circumstances to require the patent owner to 
license others on reasonable terms as provided for by the terms of 
grants awarded by the National Institutes of Health. 

TECHNICAL AREA OF THE INVENTION 

The invention relates to the area of cancer diagnostics and ther-
apeutics. More particularly, the invention relates to detection of the
germline and somatic alterations of wild-type APC genes. In addition, 
it relates to therapeutic intervention to restore the function of APC 
gene product. 

BACKGROUND OF THE INVENTION 

According to the model of Knudson for tumorigenesis (Cancer
Research, Vol. 45, p. 1482, 1985), there are tumor suppressor genes in
all normal cells which, when they become non-functional due to muta-
tion, cause neoplastic development. Evidence for this model has been
found in the cases of retinoblastoma and colorectal tumors. The impli-
cated suppressor genes in those tumors, RB, p53, DCC and MCC, were
found to be deleted or altered in many cases of the tumors studied
(Hansen and Cavenee, Cancer Research, Vol. 47, pp. 5518-5527 (1987);
Baker et al., Science, Vol. 244, p. 217 (1989); Fearon et al., Science,
Vol. 247, p. 49 (1990); Kinzler et al., Science, Vol. 251, p. 1366 (1991)).

In order to fully understand the pathogenesis of tumors, it will
be necessary to identify the other suppressor genes that play a role in
the tumorigenesis process. Prominent among these is the one(s) pre-
sumptively located at 5q21. Cytogenetic (Herrera et al., Am. J. Med.
Genet., Vol. 25, p. 473 (1986)) and linkage (Leppert et al., Science, Vol.
238, p. 1411 (1987); Bodmer et al., Nature, Vol. 328, p. 614 (1987)) stud-
ies have shown that this chromosome region harbors the gene
responsible for familial adenomatous polyposis (FAP) and Gardner's
Syndrome (GS). FAP is an autosomal-dominant, inherited disease in
which affected individuals develop hundreds to thousands of
adenomatous polyps, some of which progress to malignancy. GS is a
variant of FAP in which desmoid tumors, osteomas and other soft tissue
tumors occur together with multiple adenomas of the colon and rec-
tum. A less severe form of polyposis has been identified in which only
a few (2-40) polyps develop. This condition also is familial and is linked
to the same chromosomal markers as FAP and GS (Leppert et al., New
England Journal of Medicine, Vol. 322, pp. 904-908, 1990). Additionally,
this chromosomal region is often deleted from the adenomas
(Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988)) and carcino-
mas (Vogelstein et al., N. Engl. J. Med., Vol. 319, p. 525 (1988); Solomon
et al., Nature, Vol. 328, p. 616 (1987); Sasaki et al., Cancer Research,
Vol. 49, p. 4402 (1989); Delattre et al., Lancet, Vol. 2, p. 353 (1989); and
Ashton-Rickardt et al., Oncogene, Vol. 4, p. 1169 (1989)) of patients
without FAP (sporadic tumors). Thus, a putative suppressor gene on
chromosome 5q21 appears to play a role in the early stages of
colorectal neoplasia in both sporadic and familial tumors.

Although the MCC gene has been identified on 5q21 as a candi-
date suppressor gene, it does not appear to be altered in FAP or GS
patients. Thus there is a need in the art for investigations of this chro- 
mosomal region to identify genes and to determine if any of such genes 
are associated with FAP and/or GS and the process of tumorigenesis.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide a method for
diagnosing and prognosing a neoplastic tissue of a human. 

It is another object of the invention to provide a method of 
detecting genetic predisposition to cancer. 

It is another object of the invention to provide a method of sup-
plying wild-type APC gene function to a cell which has lost said gene
function.

It is yet another object of the invention to provide a kit for
determination of the nucleotide sequence of APC alleles by the
polymerase chain reaction.

It is still another object of the invention to provide nucleic acid
probes for detection of mutations in the human APC gene.

It is still another object of the invention to provide a cDNA mol-
ecule encoding the APC gene product.

It is yet another object of the invention to provide a preparation 
of the human APC protein.

It is another object of the invention to provide a method of
screening for genetic predisposition to cancer. 

It is an object of the invention to provide methods of testing 
therapeutic agents for the ability to suppress neoplasia.

It is still another object of the invention to provide animals car-
rying mutant APC alleles.

These and other objects of the invention are provided by one or
more of the embodiments which are described below. In one embodi-
ment of the present invention a method of diagnosing or prognosing a
neoplastic tissue of a human is provided comprising: detecting somatic
alteration of wild-type APC genes or their expression products in a
sporadic colorectal cancer tissue, said alteration indicating neoplasia of
the tissue.

In yet another embodiment a method is provided of detecting 
genetic predisposition to cancer in a human including familial 
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), comprising: 
isolating a human sample selected from the group consisting of blood 
and fetal tissue; detecting alteration of wild-type APC gene coding 
sequences or their expression products from the sample, said alteration 
indicating genetic predisposition to cancer. 

In another embodiment of the present invention a method is
provided for supplying wild-type APC gene function to a cell which has
lost said gene function by virtue of a mutation in the APC gene, com-
prising: introducing a wild-type APC gene into a cell which has lost
said gene function such that said wild-type gene is expressed in the
cell.

In another embodiment a method of supplying wild-type APC
gene function to a cell is provided comprising: introducing a portion of
a wild-type APC gene into a cell which has lost said gene function such
that said portion is expressed in the cell, said portion encoding a part
of the APC protein which is required for non-neoplastic growth of said
cell. APC protein can also be applied to cells or administered to ani-
mals to remediate for mutant APC genes. Synthetic peptides or drugs
can also be used to mimic APC function in cells which have altered
APC expression.

In yet another embodiment a pair of single stranded primers is
provided for determination of the nucleotide sequence of the APC gene
by polymerase chain reaction. The sequence of said pair of single
stranded DNA primers is derived from chromosome 5q band 21, said
pair of primers allowing synthesis of APC gene coding sequences.

In still another embodiment of the invention a nucleic acid probe
is provided which is complementary to human wild-type APC gene cod-
ing sequences and which can form mismatches with mutant APC genes,
thereby allowing their detection by enzymatic or chemical cleavage or
by shifts in electrophoretic mobility.

In another embodiment of the invention a method is provided for
detecting the presence of a neoplastic tissue in a human. The method
comprises isolating a body sample from a human; detecting in said sam-
ple alteration of a wild-type APC gene sequence or wild-type APC
expression product, said alteration indicating the presence of a
neoplastic tissue in the human.

In still another embodiment a cDNA molecule is provided which 
comprises the coding sequence of the APC gene. 

In even another embodiment a preparation of the human APC
protein is provided which is substantially free of other human proteins.
The amino acid sequence of the protein is shown in Figure 3 or 7.

In yet another embodiment of the invention a method is provided
for screening for genetic predisposition to cancer, including familial
adenomatous polyposis (FAP) and Gardner's Syndrome (GS), in a human.
The method comprises: detecting among kindred persons the presence
of a DNA polymorphism which is linked to a mutant APC allele in an
individual having a genetic predisposition to cancer, said kindred being
genetically related to the individual, the presence of said polymorphism
suggesting a predisposition to cancer.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically trans-
formed phenotype is provided. The method comprises: applying a test
substance to a cultured epithelial cell which carries a mutation in an
APC allele; and determining whether said test substance suppresses
the neoplastically transformed phenotype of the cell.

In another embodiment of the invention a method of testing
therapeutic agents for the ability to suppress a neoplastically trans-
formed phenotype is provided. The method comprises: administering a
test substance to an animal which carries a mutant APC allele; and
determining whether said test substance prevents or suppresses the
growth of tumors.

In still other embodiments of the invention transgenic animals 
are provided. The animals carry a mutant APC allele from a second 
animal species or have been genetically engineered to contain an inser- 
tion mutation which disrupts an APC allele. 

The present invention provides the art with the information that
the APC gene, a heretofore unknown gene, is, in fact, a target of muta-
tional alterations on chromosome 5q21 and that these alterations are
associated with the process of tumorigenesis. This information allows
highly specific assays to be performed to assess the neoplastic status of
a particular tissue or the predisposition to cancer of an individual. This
invention has applicability to Familial Adenomatous Polyposis, sporadic
colorectal cancers, Gardner's Syndrome, as well as the less severe
familial polyposis discussed above.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1A shows an overview of yeast artificial chromosome
(YAC) contigs. Genetic distances between selected RFLP markers
from within the contigs are shown in centiMorgans.

Figure 1B shows a detailed map of the three central contigs.
The position of the six identified genes from within the FAP region is
shown; the 5' and 3' ends of the transcripts from these genes have in
general not yet been isolated, as indicated by the string of dots sur-
rounding the bars denoting the genes' positions. Selected restriction
endonuclease recognition sites are indicated: B, BssH2; S, SstII;
M, MluI; N, NruI.

Figure 2 shows the sequence of TB1 and TB2 genes. The cDNA
sequence of the TB1 gene was determined from the analysis of 11
cDNA clones derived from normal colon and liver, as described in the
text. A total of 2314 bp were contained within the overlapping cDNA
clones, defining an ORF of 424 amino acids beginning at nucleotide 1.
Only the predicted amino acids from the ORF are shown. The
carboxy-terminal end of the ORF has apparently been identified, but
the 5' end of the TB1 transcript has not yet been precisely determined.

The cDNA sequence of the TB2 gene was determined from the
YS-39 clone derived as described in the text. This clone consisted of
2300 bp and defined an ORF of 185 amino acids beginning at nucleotide
1. Only the predicted amino acids are shown. The carboxy-terminal
end of the ORF has apparently been identified, but the 5' end of the
TB2 transcript has not been precisely determined.

Figure 3 shows the sequence of the APC gene product. The
cDNA sequence was determined through the analysis of 87 cDNA clones
derived from normal colon, liver, and brain. A total of 8973 bp were
contained within overlapping cDNA clones, defining an ORF of 2842
amino acids. In-frame stop codons surrounded this ORF, as described in
the text, suggesting that the entire APC gene product was represented
in the ORF illustrated. Only the predicted amino acids are shown.

Figure 4 shows the local similarity between human APC and ral2
of yeast. Local similarity among the APC and MCC genes and the m3
muscarinic acetylcholine receptor is shown. The region of the mAChR
shown corresponds to that responsible for coupling the receptor to G
proteins. The connecting lines indicate identities; dots indicate related
amino acid residues.

Figure 5 shows the genomic map of the 1200 kb NotI fragment at
the FAP locus. The NotI fragment is shown as a bold line. Relevant
parts of the deletion chromosomes from patients 3214 and 3824 are
shown as stippled lines. Probes used to characterize the NotI fragment
and the deletions, and three YACs from which subclones were obtained,
are shown below the restriction map. The chimeric end of YAC
183H12 is indicated by a dotted line. The orientation and approximate
position of MCC are indicated above the map.

Figure 6 shows the DNA sequence and predicted amino acid 
sequence of DP1 (TB2). The nucleotide numbering begins at the most 5'
nucleotide isolated. A proposed initiation methionine (base 77) is indi- 
cated in bold type. The entire coding sequence is presented. 

Figure 7 shows the cDNA and predicted amino acid sequence of
DP2.5 (APC). The nucleotide numbering begins at the proposed initia-
tion methionine. The nucleotides and amino acids of the alternatively
spliced exon (exon 9; nucleotide positions 934-1236) are presented in
lower case letters. At the 3' end, a poly(A) addition signal occurs at
9330, and one cDNA clone has a poly(A) at 9563. Other cDNA clones
extend beyond 9563, however, and their consensus sequence is included
here.

Figure 8 shows the arrangement of exons in DP2.5 (APC).
(A) Exon 9 corresponds to nucleotides 933-1312; exon 9a corresponds to
nucleotides 1236-1312. The stop codon in the cDNA is at nucleotide
8535. (B) Partial intronic sequence surrounding each exon is shown.

DETAILED DESCRIPTION

It is a discovery of the present invention that mutational events
associated with tumorigenesis occur in a previously unknown gene on
chromosome 5q named here the APC (Adenomatous Polyposis Coli)
gene. Although it was previously known that deletion of alleles on
chromosome 5q was common in certain types of cancers, it was not
known that a target gene of these deletions was the APC gene. Fur-
ther, it was not known that other types of mutational events in the APC
gene are also associated with cancers. The mutations of the APC gene
can involve gross rearrangements, such as insertions and deletions.
Point mutations have also been observed.

According to the diagnostic and prognostic method of the
present invention, alteration of the wild-type APC gene is detected.
"Alteration of a wild-type gene" according to the present invention
encompasses all forms of mutations, including deletions. The alter-
ation may be due to either rearrangements such as insertions, inver-
sions, and deletions, or to point mutations. Deletions may be of the
entire gene or only a portion of the gene. Somatic mutations are those
which occur only in certain tissues, e.g., in the tumor tissue, and are
not inherited in the germline. Germline mutations can be found in any
of a body's tissues. If only a single allele is somatically mutated, an
early neoplastic state is indicated. However, if both alleles are
mutated then a late neoplastic state is indicated. The finding of APC
mutations thus provides both diagnostic and prognostic information.
An APC allele which is not deleted (e.g., that on the sister chromosome
to a chromosome carrying an APC deletion) can be screened for other
mutations, such as insertions, small deletions, and point mutations. It
is believed that many mutations found in tumor tissues will be those
leading to decreased expression of the APC gene product. However,
mutations leading to non-functional gene products would also lead to a
cancerous state. Point mutational events may occur in regulatory
regions, such as in the promoter of the gene, leading to loss or diminu-
tion of expression of the mRNA. Point mutations may also abolish
proper RNA processing, leading to loss of expression of the APC gene
product.

In order to detect the alteration of the wild-type APC gene in a
tissue, it is helpful to isolate the tissue free from surrounding normal
tissues. Means for enriching a tissue preparation for tumor cells are
known in the art. For example, the tissue may be isolated from paraf-
fin or cryostat sections. Cancer cells may also be separated from nor-
mal cells by flow cytometry. These as well as other techniques for
separating tumor from normal cells are well known in the art. If the
tumor tissue is highly contaminated with normal cells, detection of
mutations is more difficult.

Detection of point mutations may be accomplished by molecular
cloning of the APC allele (or alleles) and sequencing that allele(s) using
techniques well known in the art. Alternatively, the polymerase chain
reaction (PCR) can be used to amplify gene sequences directly from a
genomic DNA preparation from the tumor tissue. The DNA sequence
of the amplified sequences can then be determined. The polymerase
chain reaction itself is well known in the art. See, e.g., Saiki et al.,
Science, Vol. 239, p. 487, 1988; U.S. 4,683,203; and U.S. 4,683,195.
Specific primers which can be used in order to amplify the gene will
be discussed in more detail below. The ligase chain reaction, which is
known in the art, can also be used to amplify APC sequences. See Wu
et al., Genomics, Vol. 4, pp. 560-569 (1989). In addition, a technique
known as allele specific PCR can be used. (See Ruano and Kidd,
Nucleic Acids Research, Vol. 17, p. 8392, 1989.) According to this
technique, primers are used which hybridize at their 3' ends to a par-
ticular APC mutation. If the particular APC mutation is not present,
an amplification product is not observed. Amplification Refractory
Mutation System (ARMS) can also be used, as disclosed in European
Patent Application Publication No. 0332435 and in Newton et al.,
Nucleic Acids Research, Vol. 17, p. 7, 1989. Insertions and deletions of
genes can also be detected by cloning, sequencing and amplification. In
addition, restriction fragment length polymorphism (RFLP) probes for
the gene or surrounding marker genes can be used to score alteration
of an allele or an insertion in a polymorphic fragment. Such a method
is particularly useful for screening among kindred persons of an
affected individual for the presence of the APC mutation found in that
individual. Single stranded conformation polymorphism (SSCP) analysis
can also be used to detect base change variants of an allele. (Orita et
al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 2766-2770, 1989, and
Genomics, Vol. 5, pp. 874-879, 1989.) Other techniques for detecting
insertions and deletions as are known in the art can be used.
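
The allele specific PCR principle described above can be illustrated
with a minimal Python sketch. It rests on simplifying assumptions that
are not part of the disclosure: the primer is written in the same sense
as the displayed template strand, one internal mismatch is tolerated,
and a mismatch at the primer's 3'-terminal base blocks amplification
entirely. The sequences and the name primer_amplifies are hypothetical
and are not taken from the APC sequence.

```python
def primer_amplifies(primer: str, template: str, position: int) -> bool:
    """Model whether an allele-specific primer would prime at `position`.

    Assumption (illustrative only): extension requires an exact match at
    the primer's 3'-terminal base; at most one other mismatch is allowed.
    """
    site = template[position:position + len(primer)]
    if len(site) != len(primer):
        return False  # primer would run off the end of the template
    if site[-1] != primer[-1]:
        return False  # 3'-terminal mismatch: no amplification product
    mismatches = sum(1 for a, b in zip(primer, site) if a != b)
    return mismatches <= 1


# Hypothetical sequences differing by a single base at index 18 (G -> T).
wild_type = "ATGGCTGCAGCTTCATATGATC"
mutant = "ATGGCTGCAGCTTCATATTATC"

# Primer whose 3' end sits on the mutant base.
mutant_primer = mutant[8:19]

print(primer_amplifies(mutant_primer, mutant, 8))
print(primer_amplifies(mutant_primer, wild_type, 8))
```

Under these assumptions the primer yields a product only on the mutant
template, which mirrors the statement that no amplification product is
observed when the particular mutation is absent.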

Alteration of wild-type genes can also be detected on the basis
of the alteration of a wild-type expression product of the gene. Such
expression products include both the APC mRNA as well as the APC
protein product. The sequences of these products are shown in
Figures 3 and 7. Point mutations may be detected by amplifying and
sequencing the mRNA or via molecular cloning of cDNA made from the
mRNA. The sequence of the cloned cDNA can be determined using
DNA sequencing techniques which are well known in the art. The
cDNA can also be sequenced via the polymerase chain reaction (PCR)
which will be discussed in more detail below.

Mismatches, according to the present invention, are hybridized
nucleic acid duplexes which are not 100% homologous. The lack of
total homology may be due to deletions, insertions, inversions, substitu-
tions or frameshift mutations. Mismatch detection can be used to
detect point mutations in the gene or its mRNA product. While these
techniques are less sensitive than sequencing, they are simpler to per-
form on a large number of tumor samples. An example of a mismatch
cleavage technique is the RNase protection method, which is described
in detail in Winter et al., Proc. Natl. Acad. Sci. USA, Vol. 82, p. 7575,
1985 and Myers et al., Science, Vol. 230, p. 1242, 1985. In the practice
of the present invention the method involves the use of a labeled
riboprobe which is complementary to the human wild-type APC gene
coding sequence. The riboprobe and either mRNA or DNA isolated
from the tumor tissue are annealed (hybridized) together and subse-
quently digested with the enzyme RNase A which is able to detect
some mismatches in a duplex RNA structure. If a mismatch is detected
by RNase A, it cleaves at the site of the mismatch. Thus, when the
annealed RNA preparation is separated on an electrophoretic gel
matrix, if a mismatch has been detected and cleaved by RNase A, an
RNA product will be seen which is smaller than the full-length duplex
RNA for the riboprobe and the mRNA or DNA. The riboprobe need not
be the full length of the APC mRNA or gene but can be a segment of
either. If the riboprobe comprises only a segment of the APC mRNA or
gene it will be desirable to use a number of these probes to screen the
whole mRNA sequence for mismatches.
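
The size shift expected in the RNase protection method can be sketched
in Python as follows. This is an idealized illustration, not the
disclosed procedure: it assumes that every mismatch is recognized and
cleaved (the text notes that RNase A detects only some mismatches), that
the probe spans the whole region compared, and that cleavage occurs
immediately 5' of the mismatched base. The function name
protected_fragments and the sequences are hypothetical.

```python
def protected_fragments(wild_type: str, sample: str) -> list[int]:
    """Return the lengths of probe fragments left after mismatch cleavage.

    `wild_type` is the sense sequence the riboprobe is complementary to;
    `sample` is the equal-length sequence from the tissue under test.
    Each position where they differ is treated as a cleavage site.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be the same length")
    cuts = [i for i, (a, b) in enumerate(zip(wild_type, sample)) if a != b]
    bounds = [0] + cuts + [len(wild_type)]
    # Drop zero-length pieces (e.g. a mismatch at the very first base).
    return [end - start for start, end in zip(bounds, bounds[1:]) if end > start]


wild_type_region = "ACGUACGUAC"
matched_sample = "ACGUACGUAC"
mutant_sample = "ACGUUCGUAC"  # single-base change at index 4

print(protected_fragments(wild_type_region, matched_sample))
print(protected_fragments(wild_type_region, mutant_sample))
```

A single full-length fragment corresponds to a fully protected probe;
two or more shorter fragments correspond to the smaller RNA product
seen on the gel when a mismatch has been cleaved.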

In similar fashion, DNA probes can be used to detect mis-
matches, through enzymatic or chemical cleavage. See, e.g., Cotton et
al., Proc. Natl. Acad. Sci. USA, Vol. 85, p. 4397, 1988; and Shenk et al.,
Proc. Natl. Acad. Sci. USA, Vol. 72, p. 989, 1975. Alternatively, mis-
matches can be detected by shifts in the electrophoretic mobility of
mismatched duplexes relative to matched duplexes. See, e.g., Cariello,
Human Genetics, Vol. 42, p. 726, 1988. With either riboprobes or DNA
probes, the cellular mRNA or DNA which might contain a mutation can
be amplified using PCR (see below) before hybridization. Changes in
DNA of the APC gene can also be detected using Southern hybridiza-
tion, especially if the changes are gross rearrangements, such as dele-
tions and insertions.

DNA sequences of the APC gene which have been amplified by
use of polymerase chain reaction may also be screened using allele-spe-
cific probes. These probes are nucleic acid oligomers, each of which
contains a region of the APC gene sequence harboring a known muta-
tion. For example, one oligomer may be about 30 nucleotides in length,
corresponding to a portion of the APC gene sequence. By use of a bat-
tery of such allele-specific probes, PCR amplification products can be
screened to identify the presence of a previously identified mutation in
the APC gene. Hybridization of allele-specific probes with amplified
APC sequences can be performed, for example, on a nylon filter.
Hybridization to a particular probe under stringent hybridization condi-
tions indicates the presence of the same mutation in the tumor tissue
as in the allele-specific probe.
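
Screening an amplification product against a battery of allele-specific
probes can be sketched in Python. As a deliberate simplification,
stringent hybridization is modeled here as an exact substring match
between the probe's sense sequence and the amplified sequence; real
hybridization depends on temperature, salt and probe length, none of
which is modeled. The probe names, sequences and the function
screen_amplicon are hypothetical.

```python
def screen_amplicon(amplicon: str, probes: dict[str, str]) -> list[str]:
    """Return the names of probes that 'hybridize' to the amplicon.

    Assumption (illustrative only): a probe hybridizes under stringent
    conditions only if its sequence occurs exactly within the amplicon.
    """
    return [name for name, sequence in probes.items() if sequence in amplicon]


# Hypothetical battery: each probe carries one previously identified mutation.
probe_battery = {
    "mutation_A": "GCTTCATATTATC",
    "mutation_B": "ATGGCTACAGCTT",
}

tumor_amplicon = "ATGGCTGCAGCTTCATATTATCGGA"
normal_amplicon = "ATGGCTGCAGCTTCATATGATCGGA"

print(screen_amplicon(tumor_amplicon, probe_battery))
print(screen_amplicon(normal_amplicon, probe_battery))
```

A non-empty result corresponds to a positive hybridization signal for
the named mutation; an empty result means none of the mutations
represented in the battery was found, not that the sample is free of
all mutations.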

Alteration of APC mRNA expression can be detected by any 
technique known in the art. These include Northern blot analysis, PCR 
amplification and RNase protection. Diminished mRNA expression 
indicates an alteration of the wild-type APC gene. 

Alteration of wild-type APC genes can also be detected by 
screening for alteration of wild-type APC protein. For example, 
monoclonal antibodies immunoreactive with APC can be used to screen 
a tissue. Lack of cognate antigen would indicate an APC mutation. 
Antibodies specific for products of mutant alleles could also be used to 
detect mutant APC gene product. Such immunological assays can be 
done in any convenient format known in the art. These include West-
ern blots, immunohistochemical assays and ELISA assays. Any means
for detecting an altered APC protein can be used to detect alteration
of wild-type APC genes. Functional assays can be used, such as protein
binding determinations. For example, it is believed that APC protein
oligomerizes to itself and/or MCC protein or binds to a G protein.
Thus, an assay for the ability to bind to wild-type APC or MCC protein
or that G protein can be employed. In addition, assays can be used
which detect APC biochemical function. It is believed that APC is
involved in phospholipid metabolism. Thus, assaying the enzymatic
products of the involved phospholipid metabolic pathway can be used to
determine APC activity. Finding a mutant APC gene product indicates
alteration of a wild-type APC gene.

Mutant APC genes or gene products can also be detected in
other human body samples, such as serum, stool, urine and sputum.
The same techniques discussed above for detection of mutant APC
genes or gene products in tissues can be applied to other body samples.
Cancer cells are sloughed off from tumors and appear in such body
samples. In addition, the APC gene product itself may be secreted into
the extracellular space and found in these body samples even in the
absence of cancer cells. By screening such body samples, a simple
early diagnosis can be achieved for many types of cancers. In addition,
the progress of chemotherapy or radiotherapy can be monitored more
easily by testing such body samples for mutant APC genes or gene
products.

The methods of diagnosis of the present invention are applicable
to any tumor in which APC has a role in tumorigenesis. Deletions of
chromosome arm 5q have been observed in tumors of lung, breast,
colon, rectum, bladder, liver, sarcomas, stomach and prostate, as well
as in leukemias and lymphomas. Thus these are likely to be tumors in
which APC has a role. The diagnostic method of the present invention
is useful for clinicians so that they can decide upon an appropriate
course of treatment. For example, a tumor displaying alteration of
both APC alleles might suggest a more aggressive therapeutic regimen
than a tumor displaying alteration of only one APC allele.

The primer pairs of the present invention are useful for determi-
nation of the nucleotide sequence of a particular APC allele using the
polymerase chain reaction. The pairs of single stranded DNA primers
can be annealed to sequences within or surrounding the APC gene on
chromosome 5q in order to prime amplifying DNA synthesis of the APC
gene itself. A complete set of these primers allows synthesis of all of
the nucleotides of the APC gene coding sequences, i.e., the exons. The
set of primers preferably allows synthesis of both intron and exon
sequences. Allele specific primers can also be used. Such primers
anneal only to particular APC mutant alleles, and thus will only amplify
a product in the presence of the mutant allele as a template.

In order to facilitate subsequent cloning of amplified sequences,
primers may have restriction enzyme site sequences appended to their
5' ends. Thus, all nucleotides of the primers are derived from APC
sequences or sequences adjacent to APC except the few nucleotides
necessary to form a restriction enzyme site. Such enzymes and sites
are well known in the art. The primers themselves can be synthesized
using techniques which are well known in the art. Generally, the prim-
ers can be made using oligonucleotide synthesizing machines which are
commercially available. Given the sequence of the APC open reading
frame shown in Figure 7, design of particular primers is well within the
skill of the art.
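
Appending a restriction enzyme site to the 5' end of a primer can be
sketched in Python. The recognition sequences for EcoRI (GAATTC) and
BamHI (GGATCC) are well established; the gene-derived primer sequence,
the choice of these two enzymes and the function name add_cloning_site
are illustrative assumptions and are not taken from the disclosure.

```python
# Well-known recognition sequences, written 5' to 3'.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
}


def add_cloning_site(gene_primer: str, enzyme: str) -> str:
    """Prepend the enzyme's recognition site to the primer's 5' end.

    The 3' portion of the result stays identical to the gene-derived
    primer, so annealing and extension still depend only on gene sequence.
    """
    if enzyme not in RESTRICTION_SITES:
        raise KeyError(f"no recognition site recorded for {enzyme}")
    return RESTRICTION_SITES[enzyme] + gene_primer


# Hypothetical gene-derived primer (not from the APC sequence).
gene_primer = "ATGGCTGCAGCTTCATATG"
tailed_primer = add_cloning_site(gene_primer, "EcoRI")

print(tailed_primer)
print(tailed_primer.endswith(gene_primer))
```

In practice a few extra bases are often added 5' of the site so that
the enzyme cuts efficiently near the end of the product; that
refinement is omitted here to keep the sketch close to the text.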

The nucleic acid probes provided by the present invention are
useful for a number of purposes. They can be used in Southern hybrid-
ization to genomic DNA and in the RNase protection method for
detecting point mutations already discussed above. The probes can be
used to detect PCR amplification products. They may also be used to
detect mismatches with the APC gene or mRNA using other tech-
niques. Mismatches can be detected using either enzymes (e.g., S1
nuclease), chemicals (e.g., hydroxylamine or osmium tetroxide and
piperidine), or changes in electrophoretic mobility of mismatched
hybrids as compared to totally matched hybrids. These techniques are
known in the art. See Cotton, supra; Shenk, supra; Myers, supra; Win-
ter, supra; and Novack et al., Proc. Natl. Acad. Sci. USA, Vol. 83, p.
586, 1986. Generally, the probes are complementary to APC gene cod-
ing sequences, although probes to certain introns are also contem-
plated. An entire battery of nucleic acid probes is used to compose a
kit for detecting alteration of wild-type APC genes. The kit allows for
hybridization to the entire APC gene. The probes may overlap with
each other or be contiguous.
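
The idea of a probe battery that covers an entire sequence, with probes
either contiguous or overlapping, can be sketched in Python. The
function tile_probes, the probe length and the overlap values are
hypothetical choices for illustration; the function returns the target
segments each probe would cover (actual probes would be complementary
to these segments).

```python
def tile_probes(sequence: str, length: int, overlap: int = 0) -> list[tuple[int, str]]:
    """Split `sequence` into probe target segments covering every base.

    overlap=0 gives contiguous segments; overlap>0 makes each segment
    share that many bases with the previous one. Returns (start, segment).
    """
    if length <= 0 or not 0 <= overlap < length:
        raise ValueError("need length > 0 and 0 <= overlap < length")
    step = length - overlap
    segments = []
    start = 0
    while start < len(sequence):
        segments.append((start, sequence[start:start + length]))
        if start + length >= len(sequence):
            break  # the last segment already reaches the end
        start += step
    return segments


region = "ACGTACGTAC"  # hypothetical 10-base region

print(tile_probes(region, length=4))
print(tile_probes(region, length=4, overlap=2))
```

Overlapping tiles ensure that a mutation near the edge of one probe
also falls in the interior of a neighboring probe, at the cost of a
larger battery.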

If a riboprobe is used to detect mismatches with mRNA, it is
complementary to the mRNA of the human wild-type APC gene. The
riboprobe thus is an anti-sense probe in that it does not code for the
APC protein because it is of the opposite polarity to the sense strand.
The riboprobe generally will be labeled with a radioactive,
colorimetric, or fluorometric material, which can be accomplished by
any means known in the art. If the riboprobe is used to detect mis-
matches with DNA it can be of either polarity, sense or anti-sense.
Similarly, DNA probes also may be used to detect mismatches.
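
The relationship between a sense mRNA segment and its anti-sense
riboprobe can be shown with a short Python sketch. The probe is the
reverse complement of the mRNA, written 5' to 3'. The example sequence
and the function name antisense_riboprobe are hypothetical.

```python
# Watson-Crick pairing for RNA bases.
RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_riboprobe(mrna: str) -> str:
    """Return the anti-sense RNA (reverse complement) of an mRNA segment.

    Both input and output are written 5' to 3'.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(mrna))


sense_segment = "AUGGCU"  # hypothetical sense-strand segment
probe = antisense_riboprobe(sense_segment)

print(probe)
# Taking the reverse complement twice recovers the sense segment.
print(antisense_riboprobe(probe) == sense_segment)
```

Because the probe is of opposite polarity, it pairs with the mRNA but
does not itself encode the protein, as stated above.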

Nucleic acid probes may also be complementary to mutant
alleles of the APC gene. These are useful to detect similar mutations
in other patients on the basis of hybridization rather than mismatches.
These are discussed above and referred to as allele-specific probes. As
mentioned above, the APC probes can also be used in Southern hybrid-
izations to genomic DNA to detect gross chromosomal changes such as
deletions and insertions. The probes can also be used to select cDNA
clones of APC genes from tumor and normal tissues. In addition, the
probes can be used to detect APC mRNA in tissues to determine if
expression is diminished as a result of alteration of wild-type APC
genes. Provided with the APC coding sequence shown in Figure 7 (SEQ
ID NO: 1), design of particular probes is well within the skill of the
ordinary artisan.

According to the present invention a method is also provided of
supplying wild-type APC function to a cell which carries mutant APC
alleles. Supplying such function should suppress neoplastic growth of
the recipient cells. The wild-type APC gene or a part of the gene may
be introduced into the cell in a vector such that the gene remains
extrachromosomal. In such a situation the gene will be expressed by
the cell from the extrachromosomal location. If a gene portion is
introduced and expressed in a cell carrying a mutant APC allele, the
gene portion should encode a part of the APC protein which is required
for non-neoplastic growth of the cell. More preferred is the situation
where the wild-type APC gene or a part of it is introduced into the
mutant cell in such a way that it recombines with the endogenous
mutant APC gene present in the cell. Such recombination requires a
double recombination event which results in the correction of the APC
gene mutation. Vectors for introduction of genes both for recombina-
tion and for extrachromosomal maintenance are known in the art and
any suitable vector may be used. Methods for introducing DNA into
cells such as electroporation, calcium phosphate co-precipitation and
viral transduction are known in the art and the choice of method is
within the competence of the routineer. Cells transformed with the
wild-type APC gene can be used as model systems to study cancer
remission and drug treatments which promote such remission.

Similarly, cells and animals which carry a mutant APC allele can 
be used as model systems to study and test for substances which have 
potential as therapeutic agents. The cells are typically cultured 
epithelial cells. These may be isolated from individuals with APC 
mutations, either somatic or germline. Alternatively, the cell line can 
be engineered to carry the mutation in the APC allele. After a test 
substance is applied to the cells, the neoplastically transformed phenotype 
of the cell will be determined. Any trait of neoplastically transformed 
cells can be assessed, including anchorage-independent growth, 
tumorigenicity in nude mice, invasiveness of cells, and growth factor 
dependence. Assays for each of these traits are known in the art. 

Animals for testing therapeutic agents can be selected after 
mutagenesis of whole animals or after treatment of germline cells or 
zygotes. Such treatments include insertion of mutant APC alleles, usually 
from a second animal species, as well as insertion of disrupted 
homologous genes. Alternatively, the endogenous APC gene(s) of the 
animals may be disrupted by insertion or deletion mutation. After test 
substances have been administered to the animals, the growth of 
tumors must be assessed. If the test substance prevents or suppresses 
the growth of tumors, then the test substance is a candidate therapeutic 
agent for the treatment of FAP and/or sporadic cancers. 

Polypeptides which have APC activity can be supplied to cells 
which carry mutant or missing APC alleles. The sequence of the APC 
protein is disclosed in Figure 3 or 7 (SEQ ID NO: 7 or 1). These two 
sequences differ slightly and appear to indicate the existence of two 
different forms of the APC protein. Protein can be produced by 
expression of the cDNA sequence in bacteria, for example, using known 
expression vectors. Alternatively, APC can be extracted from APC-producing 
mammalian cells such as brain cells. In addition, the techniques 
of synthetic chemistry can be employed to synthesize APC protein. 
Any of such techniques can provide the preparation of the 
present invention which comprises the APC protein. The preparation 
is substantially free of other human proteins. This is most readily 
accomplished by synthesis in a microorganism or in vitro. 

Active APC molecules can be introduced into cells by 
microinjection or by use of liposomes, for example. Alternatively, 
some such active molecules may be taken up by cells, actively or by 
diffusion. Extracellular application of APC gene product may be sufficient 
to affect tumor growth. Supply of molecules with APC activity 
should lead to a partial reversal of the neoplastic state. Other molecules 
with APC activity may also be used to effect such a reversal, for 
example peptides, drugs, or organic compounds. 

The present invention also provides a preparation of antibodies 
immunoreactive with a human APC protein. The antibodies may be 
polyclonal or monoclonal and may be raised against native APC protein, 
APC fusion proteins, or mutant APC proteins. The antibodies 
should be immunoreactive with APC epitopes, preferably epitopes not 
present on other human proteins. In a preferred embodiment of the 
invention the antibodies will immunoprecipitate APC proteins from 
solution as well as react with APC protein on Western or immunoblots 
of polyacrylamide gels. In another preferred embodiment, the antibodies 
will detect APC proteins in paraffin or frozen tissue sections, using 
immunocytochemical techniques. Techniques for raising and purifying 
antibodies are well known in the art and any such techniques may be 
chosen to achieve the preparation of the invention. 

Predisposition to cancers as in FAP and GS can be ascertained 
by testing any tissue of a human for mutations of the APC gene. For 
example, a person who has inherited a germline APC mutation would be 
prone to develop cancers. This can be determined by testing DNA from 
any tissue of the person's body. Most simply, blood can be drawn and 
DNA extracted from the cells of the blood. In addition, prenatal diagnosis 
can be accomplished by testing fetal cells, placental cells, or 
amniotic fluid for mutations of the APC gene. Alteration of a wild-type 
APC allele, whether for example, by point mutation or by deletion, 
can be detected by any of the means discussed above. 

Molecules of cDNA according to the present invention are 
intron-free, APC gene coding molecules. They can be made by reverse 
transcriptase using the APC mRNA as a template. These molecules 
can be propagated in vectors and cell lines as is known in the art. Such 
molecules have the sequence shown in SEQ ID NO: 7. The cDNA can 
also be made using the techniques of synthetic chemistry given the 
sequence disclosed herein. 

A short region of homology has been identified between APC and 
the human m3 muscarinic acetylcholine receptor (mAChR). This 
homology was largely confined to 29 residues in which 6 out of 7 amino 
acids (EL(G or A)GLQA) were identical (see Figure 4). Initially, it was 
not known whether this homology was significant, because many other 
proteins had higher levels of global homology (though few had six out of 
seven contiguous amino acids in common). However, a study on the 
sequence elements controlling G protein activation by mAChR subtypes 
(Lechleiter et al., EMBO J., p. 4381 (1990)) has shown that a 21 amino 
acid region from the m3 mAChR completely mediated G protein specificity 
when substituted for the 21 amino acids of m2 mAChR at the 
analogous protein position. These 21 residues overlap the 19 amino acid 
homology between APC and m3 mAChR. 

This connection between APC and the G protein activating 
region of mAChR is intriguing in light of previous investigations relating 
G proteins to cancer. For example, the RAS oncogenes, which are 
often mutated in colorectal cancers (Vogelstein et al., N. Engl. J. 
Med., Vol. 319, p. 525 (1988); Bos et al., Nature, Vol. 327, p. 293 (1987)), 
are members of the G protein family (Bourne et al., Nature, Vol. 348, 
p. 125 (1990)), as is an in vitro transformation suppressor (Noda et al., 
Proc. Natl. Acad. Sci. USA, Vol. 86, p. 162 (1989)) and genes mutated in 
hormone producing tumors (Landis et al., Nature, Vol. 340, p. 692 
(1989); Lyons et al., Science, Vol. 249, p. 653 (1990)). Additionally, the 
gene responsible for neurofibromatosis (presumably a tumor suppressor 
gene) has been shown to activate the GTPase activity of RAS (Xu et al., 
Cell, Vol. 63, p. 835 (1990); Martin et al., Cell, Vol. 63, p. 843 (1990); 
Ballester et al., Cell, Vol. 63, p. 891 (1990)). Another interesting link 
between G proteins and colon cancer involves the drug sulindac. This 
agent has been shown to inhibit the growth of benign colon tumors in 
patients with FAP, presumably by virtue of its activity as a 
cyclooxygenase inhibitor (Waddell et al., J. Surg. Oncology 24(1), 83 
(1983); Waddell et al., Am. J. Surg. 157(1), 175 (1989); Charneau et al., 
Gastroenterologie Clinique et Biologique 14(2), 153 (1990)). 
Cyclooxygenase is required to convert arachidonic acid to 
prostaglandins and other biologically active molecules. G proteins are 
known to regulate phospholipase A2 activity, which generates 
arachidonic acid from phospholipids (Role et al., Proc. Natl. Acad. Sci. 
USA, Vol. 84, p. 8623 (1987); Kurachi et al., Nature, Vol. 337, p. 555 
(1989)). Therefore we propose that wild-type APC protein functions by 
interacting with a G protein and is involved in phospholipid 
metabolism. 

The following are provided for exemplification purposes only and 
are not intended to limit the scope of the invention which has been 
described in broad terms above. 
Example 1 

This example demonstrates the isolation of a 5.5 Mb region of 
human DNA linked to the FAP locus. Six genes are identified in this 
region, all of which are expressed in normal colon cells and in 
colorectal, lung, and bladder tumors. 

The cosmid markers YN5.64 and YN5.48 have previously been 
shown to delimit an 8 cM region containing the locus for FAP 
(Nakamura et al., Am. J. Hum. Genet., Vol. 43, p. 638 (1988)). Further 
linkage and pulsed-field gel electrophoresis (PFGE) analysis with additional 
markers has shown that the FAP locus is contained within a 4 cM 
region bordered by cosmids EF5.44 and L5.99. In order to isolate clones 
representing a significant portion of this locus, a yeast artificial chromosome 
(YAC) library was screened with various 5q21 markers. 
Twenty-one YAC clones, distributed within six contigs and including 
5.5 Mb from the region between YN5.64 and YN5.48, were obtained 
(Figure 1A). 

Three contigs encompassing approximately 4 Mb were contained 
within the central portion of this region. The YACs constituting these 
contigs, together with the markers used for their isolation and orientations, 
are shown in Figure 1. These YAC contigs were obtained in the 
following way. To initiate each contig, the sequence of a genomic 
marker cloned from chromosome 5q21 was determined and used to 
design primers for PCR. PCR was then carried out on pools of YAC 
clones distributed in microtiter trays as previously described (Anand 
et al., Nucleic Acids Research, Vol. 18, p. 1951 (1990)). Individual YAC 
clones from the positive pools were identified by further PCR or 
hybridization based assays, and the YAC sizes were determined by 
PFGE. 

To extend the areas covered by the original YAC clones, "chromosomal 
walking" was performed. For this purpose, YAC termini were 
isolated by a PCR based method and sequenced (Riley et al., Nucleic 
Acids Research, Vol. 18, p. 2887 (1990)). PCR primers based on these 
sequences were then used to rescreen the YAC library. For example, 
the sequence from an intron of the FER gene (Hao et al., Mol. Cell. 
Biol., Vol. 9, p. 1587 (1989)) was used to design PCR primers for isolation 
of the 28EC1 and 5EH8 YACs. The termini of the 28EC1 YAC 
were sequenced to derive markers RHE28 and LHE28, respectively. 
The sequences of these two markers were then used to isolate YAC 
clones 15CH12 (from RHE28) and 40CF1 and 29EF1 (from LHE28). 
These five YACs formed a contig encompassing 1200 kb (contig 1, 
Figure 1B). 

Similarly, contig 2 was initiated using cosmid N5.66 sequences, 
and contig 3 was initiated using sequences both from the MCC gene and 
from cosmid EF5.44. A walk in the telomeric direction from YAC 
14FH1 and a walk in the opposite direction from YAC 39GG3 allowed 
connection of the initial contig 3 clones through YAC 37HG4 
(Figure 1B). 

Multipoint linkage analysis with the various markers used to 
define the contigs, combined with PFGE analysis, showed that contigs 1 
and 2 were centromeric to contig 3. These contigs were used as tools 
to orient and/or identify genes which might be responsible for FAP. 
Six genes were found to lie within this cluster of YACs, as follows: 

Contig 1: FER - The FER gene was discovered through its 
homology to the viral oncogene ABL (Hao et al., supra). It has an 
intrinsic tyrosine kinase activity, and in situ hybridization with an FER 
probe showed that the gene was located at 5q11-23 (Morris et al., 
Cytogenet. Cell. Genet., Vol. 53, p. 4 (1990)). Because of the potential 
role of this oncogene-related gene in neoplasia, we decided to evaluate 
it further with regard to the FAP locus. A human genomic clone from 
FER was isolated (MF 2.3) and used to define a restriction fragment 
length polymorphism (RFLP), and the RFLP was in turn used to map FER by 
linkage analysis using a panel of three generation families. This 
showed that FER was very tightly linked to previously defined 
polymorphic markers for the FAP locus. The genetic mapping of FER 
was complemented by physical mapping using the YAC clones derived 
from FER sequences (Figure 1B). Analysis of YAC contig 1 showed that 
FER was within 600 kb of cosmid marker M5.28, which maps to within 
1.5 Mb of cosmid L5.99 by PFGE of human genomic DNA. Thus, the 
YAC mapping results were consistent with the FER linkage data and 
PFGE analyses. 

Contig 2: TB1 - TB1 was identified through a cross-hybridization 
approach. Exons of genes are often evolutionarily conserved while 
introns and intergenic regions are much less conserved. Thus, if a 
human probe cross-hybridizes strongly to the DNA from non-primate 
species, there is a reasonable chance that it contains exon sequences. 
Subclones of the cosmids shown in Figure 1 were used to screen Southern 
blots containing rodent DNA samples. A subclone of cosmid N5.66 
(p 5.66-4) was shown to strongly hybridize to rodent DNA, and this 
clone was used to screen cDNA libraries derived from normal adult 
colon and fetal liver. The ends of the initial cDNA clones obtained in 
this screen were then used to extend the cDNA sequence. Eventually, 
11 cDNA clones were isolated, covering 2314 bp. The gene detected by 
these clones was named TB1. Sequence analysis of the overlapping 
clones revealed an open reading frame (ORF) that extended for 1302 bp 
starting from the most 5' sequence data obtained (Figure 2A). If this 
entire open reading frame were translated, it would encode 434 amino 
acids. The product of this gene was not globally homologous to any 
other sequence in the current database but showed two significant local 
similarities to a family of ADP,ATP carrier/translocator proteins and 
mitochondrial brown fat uncoupling proteins which are widely distributed 
from yeast to mammals. These conserved regions of TB1 
(underlined in Figure 2A) may define a predictive motif for this 
sequence family. In addition, TB1 appeared to contain a signal peptide 
(or mitochondrial targeting sequence) as well as at least 7 
transmembrane domains. 

Contig 3: MCC, TB2, SRP and APC - The MCC gene was also 
discovered through a cross-hybridization approach, as described previously 
(Kinzler et al., Science, Vol. 251, p. 1366 (1991)). The MCC gene 
was considered a candidate for causing FAP by virtue of its tight 
genetic linkage to FAP susceptibility and its somatic mutation in sporadic 
colorectal carcinomas. However, mapping experiments suggested 
that the coding region of MCC was approximately 50 kb proximal to 
the centromeric end of a 200 kb deletion found in an FAP patient. 
MCC cDNA probes detected a 10 kb mRNA transcript on Northern blot 
analysis of which 4151 bp, including the entire open reading frame, 
have been cloned. Although the 3' non-translated portion or an alternatively 
spliced form of MCC might have extended into this deletion, it 
was possible that the deletion did not affect the MCC gene product. 
We therefore used MCC sequences to initiate a YAC contig, and subsequently 
used the YAC clones to identify genes 50 to 250 kb distal to 
MCC that might be contained within the deletion. 

In a first approach, the insert from YAC 24ED6 (Figure 1B) was 
radiolabelled and hybridized to a cDNA library from normal colon. One 
of the cDNA clones (YS39) identified in this manner detected a 3.1 kb 
mRNA transcript when used as a probe for Northern blot hybridization. 
Sequence analysis of the YS39 clone revealed that it encompassed 2263 
nucleotides and contained an ORF that extended for 555 bp from the 
most 5' sequence data obtained. If all of this ORF were translated, it 
would encode 185 amino acids (Figure 2B). The gene detected by YS39 
was named TB2. Searches of nucleotide and protein databases revealed 
that the TB2 gene was not identical to any previously reported 
sequences nor were there any striking similarities. 
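
The amino acid counts quoted for TB1 and TB2 follow directly from the ORF lengths, at three nucleotides per codon. The short sketch below simply checks that arithmetic; the function name orf_codons is illustrative and not part of the original disclosure, and the calculation assumes, as the text does, that the entire ORF is translated. 

```python
def orf_codons(orf_length_bp: int) -> int:
    """Return the number of complete codons in an ORF of the given length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_length_bp // 3


# ORF lengths (bp) and amino acid counts as quoted in the text.
reported = {"TB1": (1302, 434), "TB2": (555, 185)}

for gene, (orf_bp, residues) in reported.items():
    # Each codon encodes one residue if the whole ORF is translated.
    print(gene, orf_codons(orf_bp), "codons; text reports", residues, "amino acids")
```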

Another clone (YS11) identified through the YAC 24ED6 screen 
appeared to contain portions of two distinct genes. Sequences from 
one end of YS11 were identical to at least 130 bp of the signal recognition 
particle protein SRP19 (Lingelbach et al., Nucleic Acids Research, 
Vol. 16, p. 9431 (1988)). A second ORF, from the opposite end of clone 
YS11, proved to be identical to 78 bp of a novel gene which was independently 
identified through a second YAC-based approach. For the 
latter, DNA from yeast cells containing YAC 14FH1 (Figure 1B) was 
digested with EcoRI and subcloned into a plasmid vector. Plasmids that 
contained human DNA fragments were selected by colony hybridization 
using total human DNA as a probe. These clones were then used to 
search for cross-hybridizing sequences as described above for TB1, and 
the cross-hybridizing clones were subsequently used to screen cDNA 
libraries. One of the cDNA clones discovered in this way (FH38) contained 
a long ORF (2496 bp), 78 bp of which were identical to the 
above-noted sequences in YS11. The ends of the FH38 cDNA clone 
were then used to initiate cDNA walking to extend the sequence. 
Eventually, 85 cDNA clones were isolated from normal colon, brain and 
liver cDNA libraries and found to encompass 8973 nucleotides of contiguous 
transcript. The gene corresponding to this transcript was 
named APC. When used as probes for Northern blot analysis, APC 
cDNA clones hybridized to a single transcript of approximately 9.5 kb, 
suggesting that the great majority of the gene product was represented 
in the cDNA clones obtained. Sequences from the 5' end of the APC 
gene were found in YAC 37HG4 but not in YAC 14FH1. However, the 
3' end of the APC gene was found in 14FH1 as well as 37HG4. The 
yeast artificial chromosome of the present invention designated 
YAC 37HG4 has been deposited with the National Collection of Industrial 
and Marine Bacteria (NCIMB), P.O. Box 31, 135 Abbey Road, 
Aberdeen AB9 8DG, Scotland, prior to the filing of this patent application. 
The NCIMB Accession Number of YAC clone YAC 37HG4 is 
40353. Analogously, the 5' end of the MCC coding region was found in 
YAC clones 19AA9 and 26GC3 but not 24ED6 or 14FH1, while the 3' 
end displayed the opposite pattern. Thus, MCC and APC transcription 
units pointed in opposite directions, with the direction of transcription 
going from centromeric to telomeric in the case of MCC, and telomeric 
to centromeric in the case of APC. PFGE analysis of YAC DNA 
digested with various restriction endonucleases showed that TB2 and 
SRP were between MCC and APC, and that the 3' ends of the coding 
regions of MCC and APC were separated by approximately 150 kb 
(Figure 1B). 

Sequence analysis of the APC cDNA clones revealed an open 
reading frame of 8,535 nucleotides. The 5' end of the ORF contained a 
methionine codon (codon 1) that was preceded by an in-frame stop 
codon 9 bp upstream, and the 3' end was followed by several in-frame 
stop codons. The protein produced by initiation at codon 1 would contain 
2,842 amino acids (Figure 3). The results of database searching 
with the APC gene product were quite complex due to the presence of 
large segments with locally biased amino acid compositions. In spite of 
this, APC could be roughly divided into two domains. The N-terminal 
25% of the protein had a high content of leucine residues (12%) and 
showed local sequence similarities to myosins, various intermediate 
filament proteins (e.g., desmin, vimentin, neurofilaments) and 
Drosophila armadillo/human plakoglobin. The latter protein is a component 
of adhesive junctions (desmosomes) joining epithelial cells 
(Franke et al., Proc. Natl. Acad. Sci. U.S.A., Vol. 86, p. 4027 (1989); 
Peifer et al., Cell, Vol. 63, p. 1167 (1990)). The C-terminal 75% of APC 
(residues 731-2832) is 17% serine by composition with serine residues 
more or less uniformly distributed. This large domain also contains 
local concentrations of charged (mostly acidic) and proline residues. 
There was no indication of potential signal peptides, transmembrane 
regions, or nuclear targeting signals in APC, suggesting a cytoplasmic 
localization. 

To detect short similarities to APC, a database search was performed 
using the PAM-40 matrix (Altschul, J. Mol. Biol., Vol. 219, p. 555 
(1991)). Potentially interesting matches to several proteins were found. 
The most suggestive of these involved the ral2 gene product of yeast, 
which is implicated in the regulation of ras activity (Fukui et al., Mol. 
Cell. Biol., Vol. 9, p. 5617 (1989)). Little is known about how ral2 might 
interact with ras, but it is interesting to note the positively-charged 
character of this region in the context of the negatively-charged GAP 
interaction region of ras. A specific electrostatic interaction between 
ras and GAP-related proteins has been proposed. 

Because of the proximity of the MCC and APC genes, and the 
fact that both are implicated in colorectal tumorigenesis, we searched 
for similarities between the two predicted proteins. Bourne has previously 
noted that MCC has the potential to form alpha helical coiled 
coils (Nature, Vol. 351, p. 188 (1991)). Lupas and colleagues have 
recently developed a program for predicting coiled coil potential from 
primary sequence data (Science, Vol. 252, p. 1162 (1991)) and we have 
used their program to analyze both MCC and APC. Analysis of MCC 
indicated a discontinuous pattern of coiled-coil domains separated by 
putative "hinge" or "spacer" regions similar to those seen in laminin 
and other intermediate filament proteins. Analysis of the APC 
sequence revealed two regions in the N-terminal domain which had 
strong coiled coil-forming potential, and these regions corresponded to 
those that showed local similarities with myosin and IF proteins on 
database searching. In addition, one other putative coiled coil region 
was identified in the central region of APC. The potential for both 
APC and MCC to form coiled coils is interesting in that such structures 
often mediate homo- and hetero-oligomerization. 

Finally, it had previously been noted that MCC shared a short 
similarity with the region of the m3 muscarinic acetylcholine receptor 
(mAChR) known to regulate specificity of G-protein coupling. The 
APC gene also contained a local similarity to the region of the m3 
mAChR that overlapped with the MCC similarity (Figure 4B). Although 
the similarities to ral2 (Figure 4A) and m3 mAChR (Figure 4B) were not 
statistically significant, they were intriguing in light of previous observations 
relating G-proteins to neoplasia. 

Each of the six genes described above was expressed in normal 
colon mucosa, as indicated by their representation in colon cDNA 
libraries. To study expression of the genes in neoplastic colorectal 
epithelium, we employed reverse transcription-polymerase chain reaction 
(PCR) assays. The sequences of FER, TB1, TB2, 
MCC, and APC were each used to design primers for PCR performed 
with cDNA templates. Each of these genes was found to be expressed 
in normal colon, in each of ten cell lines derived from colorectal cancers, 
and in tumor cell lines derived from lung and bladder tumors. The 
ten colorectal cancer cell lines included eight from patients with sporadic 
CRC and two from patients with FAP. 
Example 2 

This example demonstrates a genetic analysis of the role of the 
FER gene in FAP and sporadic colorectal cancers. 

We considered FER as a candidate because of its proximity to 
the FAP locus as judged by physical and genetic criteria (see 
Example 1), and its homology to known tyrosine kinases with oncogenic 
potential. Primers were designed to PCR-amplify the complete coding 
sequence of FER from the RNA of two colorectal cancer cell lines 
derived from FAP patients. cDNA was generated from RNA and used 
as a template for PCR. The primers used were 
5'-AGAACGATCCCTTGTGCAGTGTCGA-3' and 
5'-GACAGGATCCTGAACCTGAGTTTG-3'. The underlined nucleotides 
were altered from the true FER sequence to create BamHI sites. The 
cell lines used were JW and Difi, both derived from colorectal cancers 
of FAP patients (C. Paraskeva, B.G. Buckle, D. Sheer, C.B. Wigley, 
Int. J. Cancer 34, 49 (1984); M.E. Gross et al., Cancer Res. 51, 1452 
(1991)). The resultant 2554 basepair fragments were cloned and 
sequenced in their entirety. The PCR products were cloned in the 
BamHI site of Bluescript SK (Stratagene) and pools of at least 50 clones 
were sequenced en masse using T7 polymerase, as described in Nigro 
et al., Nature 342, 705 (1989). 

Only a single conservative amino acid change (GTG->CTG, creating 
a val to leu substitution at codon 439) was observed. The region 
surrounding this codon was then amplified from the DNA of individuals 
without FAP and this substitution was found to be a common 
polymorphism, not specifically associated with FAP. Based on these 
results, we considered it unlikely (though still possible) that the FER gene 
was responsible for FAP. To amplify the regions surrounding codon 
439, the following primers were used: 5'-TCAGAAAGTGCTGAAGAG-3' 
and 5'-GGAATAATTAGGTCTCCAA-3'. PCR products were digested 
with PstI, which yields a 50 bp fragment if codon 439 is leucine, but 26 
and 24 bp fragments if it is valine. The primers used for sequencing 
were chosen from the FER cDNA sequence in Hao et al., supra. 
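
The PstI digestion pattern described above can be read as a simple genotype call. The following is a minimal sketch, taking the fragment sizes exactly as stated in the text (50 bp for the leucine allele; 26 and 24 bp for the valine allele). The function name genotype_codon_439 is hypothetical, and the code assumes idealized, exactly sized fragments rather than bands estimated from a gel. 

```python
def genotype_codon_439(fragment_sizes_bp):
    """Call the codon 439 genotype from PstI digestion fragment sizes (bp).

    Per the text: a 50 bp fragment indicates leucine, while 26 and 24 bp
    fragments indicate valine. Sizes are assumed to be exact.
    """
    sizes = set(fragment_sizes_bp)
    has_leu = 50 in sizes
    has_val = {26, 24} <= sizes
    if has_leu and has_val:
        return "Val/Leu"      # both alleles present (heterozygote)
    if has_val:
        return "Val/Val"
    if has_leu:
        return "Leu/Leu"
    return "undetermined"     # pattern not explained by either allele


print(genotype_codon_439([50]))
print(genotype_codon_439([26, 24]))
print(genotype_codon_439([50, 26, 24]))
```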
Example 3 

This example demonstrates the genetic analysis of MCC, TB2, 
SRP and APC in FAP and sporadic colorectal tumors. Each of these 
genes is linked and encompassed by contig 3 (see Figure 1). 

Several lines of evidence suggested that this contig was of particular 
interest. First, at least three of the four genes in this contig 
were within the deleted region identified in two FAP patients. (See 
Example 5 infra.) Second, allelic deletions of chromosome 5q21 in sporadic 
cancers appeared to be centered in this region. (Ashton-Rickardt 
et al., Oncogene, in press; and Miki et al., Japn. J. Cancer Res., in 
press.) Some tumors exhibited loss of proximal RFLP markers (up to 
and potentially including the 5' end of MCC), but no loss of markers 
distal to MCC. Other tumors exhibited loss of markers distal to and 
perhaps including the 3' end of MCC, but no loss of sequences proximal 
to MCC. This suggested either that different ends of MCC were 
affected by loss in all such cases, or alternatively, that two genes (one 
proximal to and perhaps including MCC, the other distal to MCC) were 
separate targets of deletion. Third, clones from each of the six FAP 
region genes were used as probes on Southern blots containing tumor 
DNA from patients with sporadic CRC. Only two examples of somatic 
changes were observed in over 200 tumors studied: a 
rearrangement/deletion whose centromeric end was located within the 
MCC gene (Kinzler et al., supra) and an 800 bp insertion within the 
APC gene between nucleotides 4424 and 5584. Fourth, point mutations 
of MCC were observed in two tumors (Kinzler et al., supra), strongly 
suggesting that MCC was a target of mutation in at least some sporadic 
colorectal cancers. 

Based on these results, we attempted to search for subtle alterations 
of contig 3 genes in patients with FAP. We chose to examine 
MCC and APC, rather than TB2 or SRP, because of the somatic mutations 
in MCC and APC noted above. To facilitate the identification of 
subtle alterations, the genomic sequences of MCC and APC exons were 
determined (see Table I). These sequences were used to design primers 
for PCR analysis of constitutional DNA from FAP patients. 

We first amplified eight exons and surrounding introns of the 
MCC gene in affected individuals from 90 different FAP kindreds. The 
PCR products were analyzed by a ribonuclease (RNase) protection assay. 
In brief, the PCR products were hybridized to in vitro transcribed RNA 
probes representing the normal genomic sequences. The hybrids were 
digested with RNase A, which can cleave at single base pair mismatches 
within DNA-RNA hybrids, and the cleavage products were 
visualized following denaturing gel electrophoresis. Two separate 
RNase protection analyses were performed for each exon, one with the 
sense and one with the antisense strand. Under these conditions, 
approximately 40% of all mismatches are detectable. Although some 
amino acid variants of MCC were observed in FAP patients, all such 
variants were found in a small percentage of normal individuals. These 
variants were thus unlikely to be responsible for the inheritance of 
FAP. 

We next examined three exons of the APC gene. The three 
exons examined included those containing nt 822-930, 931-1309, and 
the first 300 nt of the most distal exon (nt 1956-2256). PCR and RNase 
protection analysis were performed as described in Kinzler et al., supra, 
using the primers underlined in Table I. The primers for nt 1956-2256 
were 5'-GCAAATCCTAAGACAGAACAA-3' and 
5'-GATGGCAAGCTTGAGCCAG-3'. 

In 90 kindreds, the RNase protection method was used to screen 
for mutations and in an additional 13 kindreds, the PCR products were 
cloned and sequenced to search for mutations not detectable by RNase 
protection. PCR products were cloned into a Bluescript vector modified 
as described in T.A. Holton and M.W. Graham, Nucleic Acids Res. 
19, 1136 (1991). A minimum of 100 clones were pooled and sequenced. 
Five variants were detected among the 103 kindreds analyzed. Cloning 
and subsequent DNA sequencing of the PCR product of patient P21 
indicated a C to T transition in codon 413 that resulted in a change 
from arginine to cysteine. This amino acid variant was not observed in 
any of 200 DNA samples from individuals without FAP. Cloning and 
sequencing of the PCR product from patients P24 and P34, who demonstrated 
the same abnormal RNase protection pattern, indicated that 
both had a C to T transition at codon 301 that resulted in a change 
from arginine (CGA) to a stop codon (TGA). This change was not 
present in 200 individuals without FAP. As this point mutation resulted 
in the predicted loss of the recognition site for the enzyme Taq I, 
appropriate PCR products could be digested with Taq I to detect the 
mutation. This allowed us to determine that the stop codon 
co-segregated with disease phenotype in members of the family of P24. 
The inheritance of this change in affected members of the pedigree 
provides additional evidence for the importance of the mutation. 
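
The loss of the Taq I site can be illustrated with a small sketch. Taq I recognizes the sequence TCGA, so a C to T change in the CGA codon destroys a site formed with a preceding T. The ten-base sequence used below is hypothetical flanking context chosen only for illustration (it is not taken from the APC sequence), and the helper name taqi_sites is likewise an assumption. 

```python
TAQI_SITE = "TCGA"  # Taq I recognition sequence


def taqi_sites(seq: str):
    """Return the 0-based start positions of every Taq I site in seq."""
    return [i for i in range(len(seq) - len(TAQI_SITE) + 1)
            if seq[i:i + len(TAQI_SITE)] == TAQI_SITE]


# Hypothetical context: "GCAT" + codon "CGA" + "GGC" (illustration only).
wild_type = "GCAT" + "CGA" + "GGC"
# The C to T transition changes the codon from CGA (arginine) to TGA (stop).
mutant = "GCAT" + "TGA" + "GGC"

print(taqi_sites(wild_type))  # one site, spanning the T before the codon
print(taqi_sites(mutant))     # no site remains after the C to T change
```

In this sketch a PCR product carrying the wild-type sequence would be cut by Taq I, while a product carrying the stop codon would remain intact, which is the basis of the digestion test described above. 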

Cloning and sequencing of the PCR product from FAP patient 
P93 indicated a C to G transversion at codon 279, also resulting in a 
stop codon (change from TCA to TGA). This mutation was not present 
in 200 individuals without FAP. Finally, one additional mutation resulting 
in a serine (TCA) to stop codon (TGA) at codon 712 was detected in 
a single patient with FAP (patient P60). 
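
The codon changes quoted in these examples can be checked against the standard genetic code. The sketch below builds the standard codon table and classifies a single-base change as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion; the function name describe_change is illustrative and not part of the original disclosure. 

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def describe_change(ref_codon: str, alt_codon: str):
    """Return (ref amino acid, alt amino acid, substitution class).

    '*' denotes a stop codon. Exactly one base must differ.
    """
    diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(diffs) != 1:
        raise ValueError("expected exactly one base difference")
    ref_base, alt_base = diffs[0]
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    kind = "transition" if same_class else "transversion"
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon], kind


# Codon changes quoted in the text.
for ref, alt in [("CGA", "TGA"), ("TCA", "TGA"), ("CAG", "TAG"), ("GTG", "CTG")]:
    print(ref, "->", alt, describe_change(ref, alt))
```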

The five germline mutations Identified are summarized in 
Table DA, as well as four others discussed in Example 9. In addition to 
these germline mutations, we identified several somatic mutations of 
MCC and APC in sporadic CRCs. Seventeen MCC exons were exam- 
ined in 90 sporadic colorectal cancers by RNase protection analysis. In 
each case where an abnormal RNase protection pattern was observed, 
the corresponding PCR products were cloned and sequenced. This led 
to the identification of six point mutations (two described previously) 
(Kinzler et al., supra), each of which was not found in the germline of
these patients (Table IIB). Four of the mutations resulted in amino acid
substitutions and two resulted in the alteration of splice site consensus 
elements. Mutations at analogous splice site positions in other genes 
have been shown to alter RNA processing in vivo and in vitro.

Three exons of APC were also evaluated in sporadic tumors. 
Sixty tumors were screened by RNase protection, and an additional 98 
tumors were evaluated by sequencing. The exons examined included nt 
822-930, 931-1309, and 1406-1343 (Table I). A total of three mutations 
were identified, each of which proved to be somatic. Tumor T27 con- 
tained a somatic mutation of CGA (arginine) to TGA (stop codon) at 
codon 33. Tumor T135 contained a GT to GC change at a splice donor
site. Tumor T34 contained a 5 bp insertion (CAGCC between codons
288 and 289) resulting in a stop at codon 291 due to a frameshift.

We serendipitously discovered one additional somatic mutation in 
a colorectal cancer. During our attempt to define the sequences and 
splice patterns of the MCC and APC gene products in colorectal 
epithelial cells, we cloned cDNA from the colorectal cancer cell line 
SW480. The amino acid sequence of the MCC gene from SW480 was
identical to that previously found in clones from human brain. The
sequence of APC in SW480 cells, however, differed significantly, in
that a transition at codon 1338 resulted in a change from glutamine 
(CAG) to a stop codon (TAG). To determine if this mutation was 
somatic, we recovered DNA from archival paraffin blocks of the origi- 
nal surgical specimen (T201) from which the tumor cell line was 
derived 28 years ago.

DNA was purified from paraffin sections as described in S.E.
Goelz, S.R. Hamilton, and B. Vogelstein, Biochem. Biophys. Res.
Comm. 130, 118 (1985). PCR was performed as described in reference
24, using the primers 5'-GTTCCAGCAGTGTCACAG-3' and
5'-GGGAGATTTCGCTCCTGA-3'. A PCR product containing codon
1338 was amplified from the archival DNA and used to show that the 
stop codon represented a somatic mutation present in the original pri- 
mary tumor and in cell lines derived from the primary and metastatic 
tumor sites, but not from normal tissue of the patient. 

The ten point mutations in the MCC and APC genes so far
discovered in sporadic CRCs are summarized in Table IIB. Analysis of the
number of mutant and wild-type PCR clones obtained from each of 
these tumors showed that in eight of the ten cases, the wild-type 
sequence was present in approximately equal proportions to the 
mutant. This was confirmed by RFLP analysis using flanking markers 
from chromosome 5q which demonstrated that only two of the ten
tumors (T135 and T201) exhibited an allelic deletion on chromosome 5q.
These results are consistent with previous observations showing that
20-40% of sporadic colorectal tumors had allelic deletions of
chromosome 5q. Moreover, these data suggest that mutations of 5q21 genes
are not limited to those colorectal tumors which contain allelic
deletions of this chromosome.
Example 4 

This example characterizes small, nested deletions in DNA from
two unrelated FAP patients. 

DNA from 40 FAP patients was screened with cosmids that had
been mapped into a region near the APC locus to identify small
deletions or rearrangements. Two of these cosmids, L5.71 and L5.79,
hybridized with a 1200 kb NotI fragment in DNAs from most of the FAP
patients screened. 

The DNA of one FAP patient, 3214, showed only a 940 kb NotI 
fragment instead of the expected 1200 kb fragment. DNA was
analyzed from four other members of the patient's immediate family; the
940 kb fragment was present in her affected mother (4711), but not in
the other, unaffected family members. The mother also carried a
normal 1200 kb NotI fragment that was transmitted to her two unaffected
offspring. These observations indicated that the mutant polyposis 
allele is on the same chromosome as the 940 kb NotI fragment. A sim- 
ple interpretation is that APC patients 3214 and 4711 each carry a 260 
kb deletion within the APC locus.

If a deletion were present, then other enzymes might also be 
expected to produce fragments with altered mobilities. Hybridization 
of L5.79 to NruI-digested DNAs from both affected members of the
family revealed a novel NruI fragment of 1300 kb, in addition to the
normal 1200 kb NruI fragment. Furthermore, MluI fragments in
patients 3214 and 4711 also showed an increase in size consistent with
the deletion of an MluI site. The two chromosome 5 homologs of
patient 3214 were segregated in somatic cell hybrid lines; HHW1155
(deletion hybrid) carried the abnormal homolog and HHW1159 (normal
hybrid) carried the normal homolog.

Because patient 3214 showed only a 940 kb NotI fragment, she 
had not inherited the 1200 kb fragment present in the unaffected 
father's DNA. This observation suggests that he must be heterozygous
for, and have transmitted, either a deletion of the L5.79 probe region
or a variant NotI fragment too large to resolve on the gel system. As
expected, the hybrid cell line HHW1159, which carries the paternal
homolog, revealed no resolved NotI fragment when probed with L5.79.
However, probing of HHW1159 DNA with L5.79 following digestion with
other enzymes did reveal restriction fragments, demonstrating the
presence of DNA homologous to the probe. The father is, therefore,
interpreted as heterozygous for a polymorphism at the NotI site, with
one chromosome 5 having a 1200 kb NotI fragment and the other
having a fragment too large to resolve consistently on the gel. The latter
was transmitted to patient 3214. 

When double digests were used to order restriction sites within 
the 1200 kb NotI fragment, L5.71 and L5.79 were both found to lie on a
550 kb NotI-NruI fragment and, therefore, on the same side of an NruI
site in the 1200 kb NotI fragment. To obtain genomic representation of
sequences present over the entire 1200 kb NotI fragment, we
constructed a library of small-fragment inserts enriched for sequences
from this fragment. DNA from the somatic cell hybrid HHW141, which
contains about 40% of chromosome 5, was digested with NotI and
electrophoresed under pulsed-field gel (PFG) conditions; EcoRI
fragments from the 1200 kb region of this gel were cloned into a phage
vector. Probe Map30 was isolated from this library. In normal
individuals probe Map30 hybridizes to the 1200 kb NotI fragment and to a 200
kb NruI fragment. This latter hybridization places Map30 distal, with
respect to the locations of L5.71 and L5.79, to the NruI site of the 550
kb NotI-NruI fragment.

Because Map30 hybridized to the abnormal, 1300 kb NruI
fragment of patient 3214, the locus defined by Map30 lies outside the
hypothesized deletion. Furthermore, in normal chromosomes Map30
identified a 200 kb NruI fragment and L5.79 identified a 1200 kb NruI
fragment; the hypothesized deletion must, therefore, be removing an
NruI site, or sites, lying between Map30 and L5.79, and these two
probes must flank the hypothesized deletion. A restriction map of the
genomic region, showing placement of these probes, is shown in
Figure 5.

A NotI digest of DNA from another FAP patient, 3824, was
probed with L5.79. In addition to the 1200 kb normal NotI fragment, a
fragment of approximately 1100 kb was observed, consistent with the
presence of a 100 kb deletion in one chromosome 5. In this case,
however, digestion with NruI and MluI did not reveal abnormal bands,
indicating that if a deletion were present, its boundaries must lie distal to
the NruI and MluI sites of the fragments identified by L5.79.
Consistent with this expectation, hybridization of Map30 to DNA from
patient 3824 identified a 760 kb MluI fragment in addition to the
expected 860 kb fragment, supporting the interpretation of a 100 kb
deletion in this patient. The two chromosome 5 homologs of patient
3824 were segregated in somatic cell hybrid lines; HHW1291 was found
to carry only the abnormal homolog and HHW1290 only the normal 
homolog. 
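
The deletion sizes inferred in this example follow from simple differences between normal and variant fragment sizes. A minimal sketch of that arithmetic is given here; the fragment sizes are those quoted in the text, the function and variable names are illustrative, and pulsed-field gel sizes are approximate, so the results should be read as estimates.

```python
def deletion_kb(normal_kb, variant_kb):
    """Estimated size (kb) lost from a restriction fragment."""
    return normal_kb - variant_kb


# (patient, enzyme) -> (normal fragment, variant fragment), in kb, from the text.
observations = {
    ("3214", "NotI"): (1200, 940),
    ("3824", "NotI"): (1200, 1100),
    ("3824", "MluI"): (860, 760),
}

estimates = {key: deletion_kb(*sizes) for key, sizes in observations.items()}
for (patient, enzyme), size in estimates.items():
    print(f"patient {patient}, {enzyme}: ~{size} kb deleted")
```

The two independent digests for patient 3824 give the same estimate, which is the consistency the text relies on.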

That the 860 kb MluI fragment identified by Map30 is distinct
from the 830 kb MluI fragment identified previously by L5.79 was
demonstrated by hybridization of Map30 and L5.79 to a NotI-MluI double
digest of DNA from the hybrid cell (HHW1159) containing the
undeleted chromosome 5 homolog of patient 3214. As previously
indicated, this hybrid is interpreted as missing one of the NotI sites that
define the 1200 kb fragment. A 620 kb NotI-MluI fragment was seen
with probe L5.79, and an 860 kb fragment was seen with Map30.
Therefore, the 830 kb MluI fragment recognized by probe L5.79 must
contain a NotI site in HHW1159 DNA; because the 860 kb MluI fragment
remains intact, it does not carry this NotI site and must be distinct
from the 830 kb MluI fragment.
Example 5

This example demonstrates the isolation of human sequences 
which span the region deleted in the two unrelated FAP patients char- 
acterized in Example 4. 

A strong prediction of the hypothesis that patients 3214 and 
3824 carry deletions is that some sequences present on normal
chromosome 5 homologs would be missing from the hypothesized deletion
homologs. Therefore, to develop genomic probes that might confirm
the deletions, as well as to identify genes from the region, YAC clones
from a contig seeded by cosmid L5.79 were localized from a library
containing seven haploid human genome equivalents (Albertsen et al.,
Proc. Natl. Acad. Sci. U.S.A., Vol. 87, pp. 4256-4260 (1990)) with
respect to the hypothesized deletions. Three clones, YACs 57B8,
310D8, and 183H12, were found to overlap the deleted region.

Importantly, one end of YAC 57B8 (clone AT57) was found to lie
within the patient 3214 deletion. Inverse polymerase chain reaction
(PCR) defined the end sequences of the insert of YAC 57B8. PCR
primers based on one of these end sequences repeatedly failed to
amplify DNA from the somatic cell hybrid (HHW1155) carrying the
deleted homolog of patient 3214, but did amplify a product of the 
expected size from the somatic cell hybrid (HHW1159) carrying the 
normal chromosome 5 homolog. This result supported the interpreta- 
tion that the abnormal restriction fragments found in the DNA of 
patient 3214 result from a deletion. 

Additional support for the hypothesis of deletion in DNA from 
patient 3214 came from subcloned fragments of YAC 183H12, which 
spans the region in question. Y11, an EcoRI fragment cloned from
YAC 183H12, hybridized to the normal, 1200 kb NotI fragment of
patient 4711, but failed to hybridize to the abnormal, 940 kb NotI
fragment of 4711 or to DNA from deletion cell line HHW1155. This result
confirmed the deletion in patient 3214. 

Two additional EcoRI fragments from YAC 183H12, Y10 and
Y14, were localized within the patient 3214 deletion by their failure to
hybridize to DNA from HHW1155. Probe Y10 hybridizes to a 150 kb
NruI fragment in normal chromosome 5 homologs. Because the 3214
deletion creates the 1300 kb NruI fragment seen with the probes L5.79
and Map30 that flank the deletion, these NruI sites and the 150 kb NruI
fragment lying between must be deleted in patient 3214. Furthermore,
probe Y10 hybridizes to the same 620 kb NotI-MluI fragment seen with
probe L5.79 in normal DNA, indicating its location as L5.79-proximal to
the deleted MluI site and placing it between the MluI site and the
L5.79-proximal NruI site. The MluI site must, therefore, lie between
the NruI sites that define the 150 kb NruI fragment (see Figure 5).
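
As a rough consistency check on this interpretation, the size of the novel NruI fragment can be predicted from the normal fragments. The sketch assumes, beyond what the text states explicitly, that the 1200 kb, 150 kb and 200 kb NruI fragments are directly adjacent and that the whole 260 kb deletion lies within them; the function name is illustrative.

```python
def fused_fragment_kb(fragments_kb, deleted_kb):
    """Size of the fragment formed when the sites between adjacent
    fragments are lost together with `deleted_kb` of sequence."""
    return sum(fragments_kb) - deleted_kb


# Normal NruI fragments seen with L5.79, Y10 and Map30 (kb), from the text.
normal_fragments = [1200, 150, 200]
predicted = fused_fragment_kb(normal_fragments, deleted_kb=260)
observed = 1300  # novel NruI fragment in patients 3214 and 4711

print("predicted:", predicted, "kb; observed: about", observed, "kb")
```

Under these assumptions the prediction falls within about 10 kb of the observed fragment, which is well inside the resolution expected of pulsed-field gels.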

Probe Y11 also hybridized to the 150 kb NruI fragment in the
normal chromosome 5 homolog, but failed to hybridize to the 620 kb
NotI-MluI fragment, placing it L5.79-distal to the MluI site, but
proximal to the second NruI site. Hybridization to the same (860 kb)
MluI fragment as Map30 confirmed the localization of probe Y11
L5.79-distal to the MluI site.

Probe Y14 was shown to be L5.79-distal to both deleted NruI
sites by virtue of its hybridization to the same 200 kb NruI fragment of
the normal chromosome 5 seen with Map30. Therefore, the order of
these EcoRI fragments derived from YAC 183H12 and deleted in
patient 3214, with respect to L5.79 and Map30, is
L5.79-Y10-Y11-Y14-Map30.

The 100 kb deletion of patient 3824 was confirmed by the failure
of aberrant restriction fragments in this DNA to hybridize with probe
Y11, combined with positive hybridizations to probes Y10 and/or Y14.
Y10 and Y14 each hybridized to the 1100 kb NotI fragment of patient
3824 as well as to the normal 1200 kb NotI fragment, but Y11
hybridized to the 1200 kb fragment only. In the MluI digest, probe Y14
hybridized to the 860 kb and 760 kb fragments of patient 3824 DNA, but
probe Y11 hybridized only to the 860 kb fragment. We conclude that
the basis for the alteration in fragment size in DNA from patient 3824
is, indeed, a deletion. Furthermore, because probes Y10 and Y14 are
missing from the deleted 3214 chromosome, but present on the deleted
3824 chromosome, and they have been shown to flank probe Y11, the
deletion in patient 3824 must be nested within the patient 3214
deletion.
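
The nesting argument can be expressed as a set comparison over the ordered probes. This is only an illustrative sketch: the probe order and the hybridization results come from the text, while the function names and data layout are hypothetical.

```python
# Probe order along chromosome 5, as stated in the text.
PROBE_ORDER = ["L5.79", "Y10", "Y11", "Y14", "Map30"]

# Probes that fail to hybridize to each patient's deleted homolog (from the text).
MISSING = {
    "3214": {"Y10", "Y11", "Y14"},
    "3824": {"Y11"},
}


def is_contiguous(missing, order=PROBE_ORDER):
    """True if the missing probes form one uninterrupted run in the map order."""
    idx = sorted(order.index(p) for p in missing)
    return idx == list(range(idx[0], idx[-1] + 1))


def is_nested(inner, outer):
    """True if every probe missing in `inner` is also missing in `outer`,
    and `outer` lacks at least one additional probe."""
    return inner < outer


nested = is_nested(MISSING["3824"], MISSING["3214"])
print("3824 deletion nested within 3214 deletion:", nested)
```

Probe-level nesting of this kind is only as fine as the probe spacing; the breakpoints themselves are located by the restriction mapping described in the text.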

Probes Y10, Y11, Y14 and Map30 each hybridized to YAC 310D8,
indicating that this YAC spanned the patient 3824 deletion and, at a
minimum, most of the 3214 deletion. The YAC characterizations,
therefore, confirmed the presence of deletions in the patients and pro- 
vided physical representation of the deleted region. 
Example 6 

This example demonstrates that the MCC coding sequence maps 
outside of the region deleted in the two FAP patients characterized in 
Example 4. 

An intriguing FAP candidate gene, MCC, recently was
ascertained with cosmid L5.71 and was shown to have undergone mutation in
colon carcinomas (Kinzler et al., supra). It was therefore of interest to
map this gene with respect to the deletions in APC patients.
Hybridization of MCC probes with an overlapping series of YAC clones
extending in either direction from L5.71 showed that the 3' end of MCC 
must be oriented toward the region of the two APC deletions. 

Therefore, two 3' cDNA clones from MCC were mapped with 
respect to the deletions: clone 1C1 (bp 2378-4181) and clone 7 (bp
2890-3560). Clone 1C1 contains sequences from the C-terminal end of
the open reading frame, which stops at nucleotide 2708, as well as 3'
untranslated sequence. Clone 7 contains sequence that is entirely 3' to
the open reading frame. Importantly, the entire 3' untranslated
sequence contained in the cDNA clones consists of a single 2.5 kb exon.
These two clones were hybridized to DNAs from the YACs spanning the
FAP region. Clone 7 fails to hybridize to YAC 310D8, although it does
hybridize to YACs 183H12 and 57B8; the same result was obtained with
the cDNA 1C1. Furthermore, these probes did show hybridization to
DNAs from both hybrid cell lines (HHW1159 and HHW1155) and the
lymphoblastoid cell line from patient 3214, confirming their locations
outside the deleted region. Additional mapping experiments suggested
that the 3' end of the MCC cDNA clone contig is likely to be located
more than 43 kb from the deletion of patient 3214 and, therefore, more 
than 100 kb from the deletion of patient 3824. 
Example 7 

This example identifies three genes within the deleted region of 
chromosome 5 in the two unrelated FAP patients characterized in 
Example 4. 

Genomic clones were used to screen cDNA libraries in three
separate experiments. One screening was done with a phage clone
derived from YAC 310D8 known to span the 260 kb deletion of patient
3214. A large-insert phage library was constructed from this YAC;
screening with Y11 identified X205, which mapped within both
deletions. When clone X205 was used to probe a random- plus
oligo(dT)-primed fetal brain cDNA library (approximately 300,000 phage), six
cDNA clones were isolated and each of them mapped entirely within
both deletions. Sequence analysis of these six clones formed a single
cDNA contig, but did not reveal an extended open reading frame. One
of the six cDNAs was used to isolate more cDNA clones, some of which
crossed the L5.71-proximal breakpoint of the 3824 deletion, as
indicated by hybridization to both chromosomes of this patient. These
clones also contained an open reading frame, indicating a
transcriptional orientation proximal to distal with respect to L5.71. This gene
was named DP1 (deleted in polyposis 1). This gene is identical to TB2
described above.

cDNA walks yielded a cDNA contig of 3.0-3.5 kb, and included
two clones containing terminal poly(A) sequences. This size
corresponds to the 3.5 kb band seen by Northern analysis. Sequencing of the
first 3163 bp of the cDNA contig revealed an open reading frame
extending from the first base to nucleotide 631, followed by a 2.3 kb 3'
untranslated region. The sequence surrounding the methionine codon
at base 77 conforms to the Kozak consensus of an initiation methionine
(Kozak, 1984). Failed attempts to walk farther, coupled with the
similarity of the lengths of isolated cDNA and mRNA, suggested that the
NH2-terminus of the DP1 protein had been reached. Hybridization to a
combination of genomic and YAC DNAs cut with various enzymes
indicated the genomic coverage of DP1 to be approximately 30 kb.

Two additional probes for the locus, YS-11 and YS-39, which had
been ascertained by screening of a cDNA library with an independent
YAC probe identified with MCC sequences adjacent to L5.71, were
mapped into the deletion region. YS-39 was shown to be a cDNA
identical in sequence to DP1. Partial characterization of YS-11 had shown
that 200 bp of DNA sequence at one end was identical to sequence
coding for the 19 kd protein of the ribosomal signal recognition particle,
SRP19 (Lingelbach et al., supra). Hybridization experiments mapped
YS-11 within both deletions. The sequence of this clone, however, was
found to be complex. Although 454 bp of the 1032 bp sequence of
YS-11 were identical to the GenBank entry for the SRP19 gene,
another 578 bp appended 5' to the SRP19 sequence was found to consist
of previously unreported sequence containing no extended open reading
frames. This suggested that YS-11 was either a chimeric clone
containing two independent inserts or a clone of an incompletely processed
or aberrant message. If YS-11 were a conventional chimeric clone, the
independent segments would not be expected to map to the same
physical region. The segments resulting from anomalous processing of a
continuous transcript, however, would map to a single chromosomal 
region. 

Inverse PCR with primers specific to the two ends of YS-11, the
SRP19 end and the unidentified region, verified that both sequences
map within the YAC 310D8; therefore, YS-11 is most likely a clone of
an immature or anomalous mRNA species. Subsequently, both ends
were shown to lie within the deleted region of patient 3824, and YS-11
was used to screen for additional cDNA clones.

Of the 24 cDNA clones selected from the fetal brain library, one 
clone, V5, was of particular interest in that it contained an open read- 
ing frame throughout, although it included only a short identity to the 
first 78 5' bases of the YS-11 sequence. Following the 78 bp of
identical sequence, the two cDNA sequences diverged at an AG.
Furthermore, divergence from genomic sequence was also seen after these 78
bp, suggesting the presence of a splice junction, and supporting the
view that YS-11 represents an irregular message.

Starting with V5, successive 5' and 3' walks were performed; the
resulting cDNA contig consisted of more than 100 clones, which
defined a new transcript, DP2. Clones walking in the 3' direction
crossed the 3824 deletion breakpoint farthest from L5.71; since its 3'
end is closer to this cosmid than its 5' end, the transcriptional
orientation of DP2 is opposite to that of MCC and DP1.

The third screening approach relied on hybridization with a 120 
kb MluI fragment from YAC 57B8. This fragment hybridizes with probe
Y11 and completely spans the 100 kb deletion in patient 3824. The
fragment was purified on two preparative PFGs, labeled, and used to
screen a fetal brain cDNA library. A number of cDNA clones
previously identified in the development of the DP1 and DP2 contigs were
reascertained. However, 19 new cDNA clones mapped into the patient
3824 deletion. Analysis indicated that these 19 formed a new contig,
DP3, containing a large open reading frame.

A clone from the 5' end of this new cDNA contig hybridized to
the same EcoRI fragment as the 3' end of DP2. Subsequently, the DP2
and DP3 contigs were connected by a single 5' walking step from DP3,
to form the single contig DP2.5. The complete nucleotide sequence of
DP2.5 is shown in Figure 7.

The consensus cDNA sequence of DP2.5 suggests that the entire
coding sequence of DP2.5 has been obtained and is 8532 bp long. The
most 5' ATG codon occurs two codons from an in-frame stop and
conforms to the Kozak initiation consensus (Kozak, Nucl. Acids Res.,
Vol. 12, pp. 857-872 (1984)). The 3' open reading frame breaks down over
the final 1.8 kb, giving multiple stops in all frames. A poly(A) sequence
was found in one clone approximately 1 kb into the 3' untranslated
region, associated with a polyadenylation signal 33 bp upstream
(position 9330). The open reading frame is almost identical to that
identified as APC above.

An alternatively spliced exon at nucleotide 934 of the DP2.5
transcript is of potential interest. It was first discovered by noting
that two classes of cDNA had been isolated. The more abundant cDNA
class contains a 303 bp exon not included in the other. The presence in
vivo of the two transcripts was verified by an exon connection
experiment. Primers flanking the alternatively spliced exon were used to
amplify, by PCR, cDNA prepared from various adult tissues. Two PCR
products that differed in size by approximately 300 bases were ampli- 
fied from all the tissues tested; the larger product was always more 
abundant than the smaller. 
Example 8

This example demonstrates the primers used to identify subtle
mutations in DP1, SRP19, and DP2.5.

To obtain DNA sequence adjacent to the exons of the genes DP1,
DP2.5, and SRP19, sequencing substrate was obtained by inverse PCR
amplification of DNAs from two YACs, 310D8 and 183H12, that span
the deletions. Ligation at low concentration cyclized the restriction
enzyme-digested YAC DNAs. Oligonucleotides with sequencing tails,
designed in inverse orientation at intervals along the cDNAs, primed
PCR amplification from the cyclized templates. Comparison of these
DNA sequences with the cDNA sequences placed exon boundaries at
the divergence points. SRP19 and DP1 were each shown to have five
exons. DP2.5 consisted of 15 exons. The sequences of the
oligonucleotides synthesized to provide PCR amplification primers for
the exons of each of these genes are listed in Table III. With the
exception of exons 1, 3, 4, 9, and 15 of DP2.5 (see below), the primer
sequences were located in intron sequences flanking the exons. The 5'
primer of exon 1 is complementary to the cDNA sequence, but extends
just into the 5' Kozak consensus sequence for the initiator methionine,
allowing a survey of the translated sequences. The 5' primer of exon 3
is actually in the 5' coding sequences of this exon, as three separate
intronic primers simply would not amplify. The 3' primer of exon 4 just
overlaps the 5' end of this exon, and we thus fail to survey the 19 most
5' bases of this exon. For exon 9, two overlapping primer sets were
used, such that each had one end within the exon. For exon 15, the
large 3' exon of DP2.5, overlapping primer pairs were placed along the
length of the exon; each pair amplified a product of 250-400 bases.
Example 9

This example demonstrates the use of single stranded
conformation polymorphism (SSCP) analysis as described by Orita et al., Proc.
Natl. Acad. Sci. U.S.A., Vol. 86, pp. 2766-70 (1989) and Genomics,
Vol. 5, pp. 874-879 (1989) as applied to DP1, SRP19 and DP2.5.

SSCP analysis identifies most single- or multiple-base changes in
DNA fragments up to 400 bases in length. Sequence alterations are
detected as shifts in electrophoretic mobility of single-stranded DNA
on nondenaturing acrylamide gels; the two complementary strands of a
DNA segment usually resolve as two SSCP conformers of distinct
mobilities. However, if the sample is from an individual heterozygous
for a base-pair variant within the amplified segment, often three or
more bands are seen. In some cases, even the sample from a
homozygous individual will show multiple bands. Base-pair-change
variants are identified by differences in pattern among the DNAs of
the sample set.

Exons of the candidate genes were amplified by PCR from the 
DNAs of 61 unrelated FAP patients and a control set of 12 normal
individuals. The five exons from DP1 revealed no unique conformers in the
FAP patients, although common conformers were observed with exons
2 and 3 in some individuals of both affected and control sets, indicating
the presence of DNA sequence polymorphisms. Likewise, none of the
five exons of SRP19 revealed unique conformers in DNA from FAP
patients in the test panel. 

Testing of exons 1 through 14 and primer sets A through K of
exon 15 of the DP2.5 gene, however, revealed variant conformers
specific to FAP patients in exons 7, 8, 10, 11, and 15. These variants were
in the unrelated patients 3746, 3460, 3827, 3712, and 3751, respectively.
The PCR-SSCP procedure was repeated for each of these exons in the
five affected individuals and in an expanded set of 48 normal controls.
The variant bands were reproducible in the FAP patients but were not
observed in any of the control DNA samples. Additional variant
conformers in exons 11 and 15 of the DP2.5 gene were seen; however, each
of these was found in both the affected and control DNA sets. The five
sets of conformers unique to the FAP patients were sequenced to
determine the nucleotide changes responsible for their altered
mobilities. The normal conformers from the host individuals were sequenced
also. Bands were cut from the dried acrylamide gels, and the DNA was
eluted. PCR amplification of these DNAs provided template for
sequencing.

The sequences of the unique conformers from exons 7, 8, 10, and
11 of DP2.5 revealed dramatic mutations in the DP2.5 gene. The
sequence of the new mutation creating the exon 7 conformer in patient
3746 was shown to contain a deletion of two adjacent nucleotides, at
positions 730 and 731 in the cDNA sequence (Figure 7). The normal
sequence at this splice junction is CAGGGTCA (intronic sequence
underlined), with the intron-exon boundary between the two repetitions
of AC. The mutant allele in this patient has the sequence CACCTCA.
Although this change is at the 5' splice site, comparison with known
consensus sequences of splice junctions would suggest that a functional
splice junction is maintained. If this new splice junction were
functional, the mutation would introduce a frameshift that creates a stop
codon 15 nucleotides downstream. If the new splice junction were not
functional, messenger processing would be significantly altered.

To confirm the 2-base deletion, the PCR product from FAP
patient 3746 and a control DNA were electrophoresed on an
acrylamide-urea denaturing gel, along with the products of a
sequencing reaction. The sample from patient 3746 showed two bands differing
in size by 2 nucleotides, with the larger band identical in mobility to
the control sample; this result was independent confirmation that
patient 3746 is heterozygous for a 2 bp deletion.

The unique conformer found in exon 8 of patient 3460 was found
to carry a C to T transition, at position 904 in the cDNA sequence of
DP2.5 (shown in Figure 7), which replaced the normal sequence of CGA
with TGA. This point mutation, when read in frame, results in a stop 
codon replacing the normal arginine codon. This single-base change 
had occurred within the context of a CG dimer, a potential hot spot for 
mutation (Barker et al., 1984). 

The conformer unique to FAP patient 3827 in exon 10 was found 
to contain a deletion of one nucleotide (1367, 1368, or 1369) when
compared to the normal sequence found in the other bands on the SSCP gel.
This deletion, occurring within a set of three Ts, changed the sequence
from CTTTCA to CTTCA; this 1 base frameshift creates a downstream
stop within 30 bases. The PCR product amplified from this patient's
DNA also was electrophoresed on an acrylamide-urea denaturing gel,
along with the PCR product from a control DNA and products from a
sequencing reaction. The patient's PCR product showed two bands 
differing by 1 bp in length, with the larger identical in mobility to the 
PCR product from the normal DNA; this result confirmed the presence 
of a 1 bp deletion in patient 3827. 
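
The reason the deleted nucleotide is reported as position 1367, 1368, or 1369 is that removing any one T from a run of three Ts gives the same product. A minimal sketch, using only the six-base sequence quoted in the text (the variable names are illustrative):

```python
normal = "CTTTCA"      # normal sequence quoted in the text
t_run = [1, 2, 3]      # 0-based indices of the three Ts within `normal`

# Delete each T in turn and collect the distinct resulting sequences.
products = {normal[:i] + normal[i + 1:] for i in t_run}
print(products)
```

Because all three deletions yield CTTCA, sequencing alone cannot say which T was lost, only that the run is one base shorter.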

Sequence analysis of the variant conformer of exon 11 from 
patient 3712 revealed the substitution of a T by a G at position 1500, 
changing the normal tyrosine codon to a stop codon. 

The pair of conformers observed in exon 15 of the DP2.5 gene
for FAP patient 3751 also was sequenced. These conformers were 
found to carry a nucleotide substitution of C to G at position 5253, the 
third base of a valine codon. No amino acid change resulted from this 
substitution, suggesting that this conformer reflects a genetically silent 
polymorphism.

The observation of distinct inactivating mutations in the DP2.5
gene in four unrelated patients strongly suggested that DP2.5 is the
gene involved in FAP. These mutations are summarized in Table IIA.
Example 10

This example demonstrates that the mutations identified in the 
DP2.5 (APC) gene segregate with the FAP phenotype. 

Patient 3746, described above as carrying an APC allele with a 
frameshift mutation, is an affected offspring of two normal parents.
Colonoscopy revealed no polyps in either parent nor among the
patient's three siblings.

DNA samples from both parents, from the patient's wife, and
from their three children were examined. SSCP analysis of DNA from
both of the patient's parents displayed the normal pattern of
conformers for exon 7, as did DNA from the patient's wife and one of his
offspring. The two other children, however, displayed the same new
conformers as their affected father. Testing of the patient and his parents
with highly polymorphic VNTR (variable number of tandem repeat) 
markers showed a 99.98% likelihood that they are his biological 
parents. 

These observations confirmed that this novel conformer, known 
to reflect a 2 bp deletion mutation in the DP2.5 gene, appeared 
spontaneously with FAP in this pedigree and was transmitted to two of the 
children of the affected individual. 
Example 11 

This example demonstrates polymorphisms in the APC gene 
which appear to be unrelated to disease (FAP). 

Sequencing of variant conformers found among controls as well 
as individuals with APC has revealed the following polymorphisms in 
the APC gene: first, in exon 11, at position 1458, a substitution of T to 
C creating an RsaI restriction site but no amino acid change; and 
second, in exon 15, at positions 5037 and 5271, substitutions of A to G and 
C to T, respectively, neither resulting in amino acid substitutions. 
These nucleotide polymorphisms in the APC gene sequence may be 
useful for diagnostic purposes. 
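
A polymorphism that creates a restriction site can be scored by digestion, which is one reason such variants are diagnostically convenient. The following is a minimal sketch of how a T-to-C change can create an RsaI site (recognition sequence GTAC). The flanking sequence is hypothetical, since the actual context of position 1458 is not reproduced in this section, and `creates_site` is an illustrative helper.

```python
RSAI_SITE = "GTAC"  # RsaI recognition sequence


def creates_site(sequence, index, new_base, site=RSAI_SITE):
    """Return True if changing sequence[index] to new_base creates a new `site`."""
    mutant = sequence[:index] + new_base + sequence[index + 1:]
    return site not in sequence and site in mutant


# Hypothetical wild-type context containing GTAT; T-to-C at index 6 gives GTAC.
wild_type = "AAGGTATCC"
print(creates_site(wild_type, 6, "C"))
```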
Example 12 

This example shows the structure of the APC gene. 

The structure of the APC gene is schematically shown in 
Figure 8, with flanking intron sequences indicated. 

The continuity of the very large (6.5 kb), most 3' exon in DP2.5 
was shown in two ways. First, inverse PCR with primers spanning the 
entire length of this exon revealed no divergence of the cDNA 
sequence from the genomic sequence. Second, PCR amplification with 
converging primers placed at intervals along the exon generated 
products of the same size whether amplified from the originally isolated 
cDNA, cDNA from various tissues, or genomic template. Two forms of 
exon 9 were found in DP2.5: one is the complete exon; and the other, 
labeled exon 9A, is the result of a splice into the interior of the exon 
that deletes bases 934 to 1236 in the mRNA and removes 101 amino 
acids from the predicted protein (see Figure 7). 
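
The exon 9A figures are internally consistent: an inclusive deletion of bases 934 to 1236 removes 303 bases, which is exactly 101 codons, so the downstream reading frame is preserved. A short sketch of that arithmetic follows (the function name is illustrative):

```python
def deletion_effect(first_base, last_base):
    """Return (bases removed, whole codons removed, in_frame) for an
    inclusive deletion of mRNA positions first_base..last_base."""
    n_bases = last_base - first_base + 1
    codons, remainder = divmod(n_bases, 3)
    return n_bases, codons, remainder == 0


print(deletion_effect(934, 1236))  # (303, 101, True)
```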

Example 13 

This example demonstrates the mapping of the FAP deletions 
with respect to the APC exons. 

Somatic cell hybrids carrying the segregated chromosomes 5 
from the 100 kb (HHW1291) and 260 kb (HHWUW) deletion patients 
were used to determine the distribution of the APC gene's exons across 
the deletions. DNAs from these cell lines were used as template, along 
with genomic DNA from a normal control, for PCR-based amplification 
of the APC exons. 

PCR analysis of the hybrids from the 260 kb deletion of patient 
3214 showed that all but one (exon 1) of the APC exons are removed by 
this deletion. PCR analysis of the somatic cell hybrid HHW1291, 
carrying the chromosome 5 homolog with the 100 kb deletion from patient 
3824, revealed that exons 1 through 9 are present but exons 10 through 
15 are missing. This result placed the deletion breakpoint either 
between exons 9 and 10 or within exon 10. 
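
The breakpoint inference above follows directly from the pattern of exons that do or do not amplify. As a rough sketch, assuming a single contiguous deletion and one PCR result per exon, the last retained and first missing exons bracket the breakpoint. The function name and the dictionary layout are illustrative, not part of the described method.

```python
def breakpoint_interval(present):
    """Given {exon number: amplified?}, return the pair of adjacent exons
    (a, b) across which the PCR result changes. The deletion breakpoint
    lies between exons a and b, or within whichever of them straddles it."""
    exons = sorted(present)
    transitions = [
        (a, b) for a, b in zip(exons, exons[1:]) if present[a] != present[b]
    ]
    if len(transitions) != 1:
        raise ValueError("expected exactly one present/absent transition")
    return transitions[0]


# Pattern reported for hybrid HHW1291: exons 1-9 present, 10-15 missing.
hhw1291 = {exon: exon <= 9 for exon in range(1, 16)}
print(breakpoint_interval(hhw1291))  # (9, 10)
```

Because a PCR assay for exon 10 fails whether the exon is wholly or only partly deleted, the result (9, 10) is read as "between exons 9 and 10 or within exon 10", as stated in the text.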
Example 14 

This example demonstrates the expression of alternatively spliced 
APC messenger in normal tissues and in cancer cell lines. 

Tissues that express the APC gene were identified by PCR 
amplification of cDNA made to mRNA with primers located within 
adjacent APC exons. In addition, PCR primers that flank the 
alternatively spliced exon 9 were chosen so that the expression pattern of 
both splice forms could be assessed. All tissue types tested (brain, lung, 
aorta, spleen, heart, kidney, liver, stomach, placenta, and colonic 
mucosa) and cultured cell lines (lymphoblasts, HL60, and 
choriocarcinoma) expressed both splice forms of the APC gene. We 
note, however, that expression by lymphocytes normally residing in 
some tissues, including colon, prevents unequivocal assessment of 
expression. The large mRNA, containing the complete exon 9 rather 
than only exon 9A, appears to be the more abundant message. 

Northern analysis of poly(A)-selected RNA from lymphoblasts 
revealed a single band of approximately 10 kb, consistent with the size 
of the sequenced cDNA. 
Example 15 

This example discusses structural features of the APC protein 
predicted from the sequence. 

The cDNA consensus sequence of APC predicts that the longer, 
more abundant form of the message codes for a 2842 or 29444 amino 
acid peptide with a mass of 311.8 kd. This predicted APC peptide was 
compared with the current data bases of protein and DNA sequences 
using both Intelligenetics and GCG software packages. No genes with a 
high degree of amino acid sequence similarity were found. Although 
many short (approximately 20 amino acid) regions of sequence 
similarity were uncovered, none was sufficiently strong to reveal which, 
if any, might represent functional homology. Interestingly, multiple 
similarities to myosins and keratins did appear. The APC gene also was 
scanned for sequence motifs of known function; although multiple 
glycosylation, phosphorylation, and myristoylation sites were seen, 
their significance is uncertain. 

Analysis of the APC peptide sequence did identify features 
important in considering potential protein structure. Hydropathy plots 
(Kyte and Doolittle, J. Mol. Biol., Vol. 157, pp. 105-132 (1982)) indicate 
that the APC protein is notably hydrophilic. No hydrophobic domains 
suggesting a signal peptide or a membrane-spanning domain were 
found. Analysis of the first 1000 residues indicates that α-helical rods 
may form (Cohen and Parry, Trends Biochem. Sci., Vol. 11, pp. 245-248 
(1986)); there is a scarcity of proline residues, and there are a number of 
regions containing heptad repeats (apolar-X-X-apolar-X-X-X). 
Interestingly, in exon 9A, the deleted form of exon 9, two heptad repeat 
regions are reconnected in the proper heptad repeat frame, deleting 
the intervening peptide region. After the first 1000 residues, the high 
proline content of the remainder of the peptide suggests a compact 
rather than a rod-like structure. 
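
The heptad pattern (apolar-X-X-apolar-X-X-X) can be searched for with a simple scan. The sketch below is only an illustration of the pattern, not the analysis actually performed: the set of residues treated as apolar, the one-letter input format, and both function names are assumptions, and the test peptide is synthetic rather than APC sequence.

```python
# Assumption: residues counted as apolar for this illustration (one-letter codes).
APOLAR = set("AVLIMFWC")


def heptad_run(peptide, start):
    """Count consecutive complete heptads beginning at `start` whose first
    and fourth positions (offsets 0 and 3) are both apolar."""
    count = 0
    pos = start
    while (
        pos + 7 <= len(peptide)
        and peptide[pos] in APOLAR
        and peptide[pos + 3] in APOLAR
    ):
        count += 1
        pos += 7
    return count


def longest_heptad_run(peptide):
    """Return (number of heptads, start index) of the longest in-frame run,
    preferring the earliest start when runs tie."""
    best = (0, 0)
    for start in range(len(peptide)):
        n = heptad_run(peptide, start)
        if n > best[0]:
            best = (n, start)
    return best


# Synthetic peptide: three LKELEEK heptads flanked by unrelated residues.
print(longest_heptad_run("GS" + "LKELEEK" * 3 + "PP"))  # (3, 2)
```

Requiring consecutive heptads to stay in the same frame mirrors the observation in the text that the exon 9A splice rejoins two heptad regions "in the proper heptad repeat frame".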

The most prominent feature of the second 1000 residues is a 20 
amino acid repeat that is iterated seven times with semiregular spacing 
(Table 4). The intervening sequences between the seven repeat regions 
contained 114, 116, 151, 205, 107, and 58 amino acids, respectively. 
Finally, residues 2200-2400 contain a 200 amino acid basic domain. 
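
As a quick consistency check on these figures, six intervening sequences separate seven repeats, and the whole repeat region fits within a 1000-residue stretch. The calculation below assumes each repeat is exactly 20 residues and that repeats do not overlap the spacers.

```python
REPEAT_LENGTH = 20                       # residues per repeat, from the text
spacers = [114, 116, 151, 205, 107, 58]  # intervening lengths, from the text

n_repeats = len(spacers) + 1             # six spacers separate seven repeats
span = n_repeats * REPEAT_LENGTH + sum(spacers)
print(n_repeats, span)  # 7 891
```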
Thr Clu Uu Hit cyt vtl Thr Atp clu Arg 
910 *1* 



21(6 



2214 



22(2 



2310 



2358 



2405 



2454 



2502 



2550 



2598 



2(45 



2(94 



2742 



2790 
~~r ica k.ck AGC TCT OCT CCC CAT ACA CAT TCA AAC ACT TAC 
{S Jg S3 K S S HI XU Al. HI. Thr HI. S.r A.0 Thr Tyr 
920 " 5 

— en AXT TCA AAT ACC ACA TCT TCT A?C CCT TAT 

E g s s s s s s »: «, «. «. ~ ? » 

. ,„ , u TXC uc ACA TCT TCA AAT CAT ACT TTA AAT ACT CTC 
JS £}. Si cS $ S Ar, S.r S.r A.« Mp S.r t.- A.n «.r VI 

,__ ... j^. CXT OCT T*T CCT AAA ACA CCT CAA ATC AAA CCC TCC ATT 
,Sr sir En £5 S Ttt Cly ly. Arg cly cm Mt Ly. rro S.r XI. 

9?0 » 75 
___ ... ICT C »A CAT CAT CAA ACT AAC TTT TCC ACT TAT CCT CAA 

cE lit T?I I" tu 1.1 *.p 01. s.r ly. fk. Cy. s.r Tyr cly cm 



as as si ^ as s s s * «; ^ ». 

1070 1075 



2830 



2806 



2934 



2902 



2020 



*xe CCA CCC 6 AC CTA CCC CAT AAA ATA CAT AOT OCA AAT CAT ATC CAT 2078 
£r S S S S All Hi. Ly. U* Hi. ..r Al. a.» Hi. H.t .j 

CAT AAT CAT CCA CAA «A CAT ACA CCA ATA AAT TAT ACT CTT AAA TAT 3136 

TCA CAT CAC CAC TTC AAC TCT CCA ACC CAA ACT e« TCA CAC AAT CAA 3174 
I« MP Si? ctn Mn S.r Cly Arj Cl« tu Pro S.r Cln A.n Clu 
2035 1040 io«. 

%f > % Tec CCA ACA CCC AAA CAC ATA ATA CAA CAT CAA ATA AAA CAA ACT 2222 

Arc Trp Ali A*9 So ty» Hi» »• »• «■ *. ? »• «*• «» 8 " 
1050 *055 

CAC CAA ACA CAA TCA ACC AAT CAA ACT ACA ACT TAT CCT CTT TAT ACT 3570 



1065 

CAC ACC ACT CAT CAT AAA CAC CTC AAC TTC CAA CCA CAT TTT CCA CAC 
til 5er Thr A»p A.p ty. U. t." ty. **• ft" Hi. f h. Cly Cln 
jO.O 1005 WW *v»« 

CAC CAA TCT CTT TCT CCA TAC ACC TCA CCC CCA CCC AAT CCT TCA CAA 
%Z c£ Si vS S.r^ro Tyr Aro S.r ArgMy Al. A.n Cly «.r01u 

ACA AAT CCA CTC CCT TCT AAT CAT CCA ATT AAT CAA AAT CTA ACC CAC 
£ En S 5il Cly S.r A.n Hi. Cly 11. A.n Cln A.n v ; l S.r Cln 
1U5 1120 1«* 

TCT TTC TOT CAA CAA CAT CAC TAT CAA CAT CAT AAC CCT ACC AAT TAT 
ITt Ul Cy. Cln Clu A.p A.p Tyr Clu A.p A.p Ly. Pre Thr A.n Tyr 
1130 I*** 1140 

ACT CAA CCT TAC TCT CAA CAA CAA CAC CAT CAA CAA CAA CAC ACA CCA 3510 
S.r Clu Arg Tyr Scr Clu Civ Clu Cln Hi. Clu Clu Clu Clu Arg Pro 
X145 1150 1155 



3316 



3365 



3414 



3462 
• m tjiT XAT CAA CAC AAA OCT CAT CTC CAT CAC 

g js $ si s iss «' «• «- sr.- - 

1160 uw 

* *m ajla TAT CCC ACA CAT ATT CCT TCA TCA CAC 
CCT ATT CAT TAT ACT TTA AAA TAT CCC ACA ^ 6ifl 
fro lie A«p Tyr s#r^t.u ty» Trr ju« * ufQ 

-re TCA AAC ACT TCA TCT CCA CAA ACC ACT AAA 
AAA CAC TCA TIT TCA TTC TCA AAC ACT * ^ ^ ^ S€r ty- 

Ly. Cln S.r Fh« *.r str fr^* * 120$ 



119$ 

ACC CAA CAT ATC 
Thr Olu Hi. H*t 
1210 

AAT CCC AAC ACC 
Mn AX* ly. Arg 
1225 

ACT CCT CAC CCT 
Smr cly Cln Pre 
1240 

CAA CAA ACA ATA 

do ciu Thr tit 



1200 

T« TCA ACC ACT CAC AAT ACC TCC ACA CCT TCA TCT 
S ITr s.r Scr Clu A.n Thr S.r Thr fro s.r Str 
1215 i«0 

CAC AAT CAC CTC CAT CCA ACT TCT CCA CAC ACT ACA 
§E £1 S£ 2« HI. Fro 5tr S.r AL Cln S.r Arg 
1230 was 

CAA AAC CCT CCC ACT TCC AAA CTT TCT TCT ATT AAC 
ClS2y.AU Alt Thr Cy. ly. VaI S.r S.r XL A.n 
1245 1250 **** 

CAC ACT TAT TCT GTA CAA CAT ACT CCA ATA TCT TTT 

§U tS Sr 5i V*l «« *. P Thr Fro XI. Cy. W 
1260 " w ** 



sssssBSsasasgsas 

12f0 W». *«wv 

ACT ACC TCA CCT CAA CAT CCT 
Thr Arg Str Alt Civ A.p Fro 
1315 



ATA CCA CAA ATA AAA OCA AAC ATT CCA 
21. Alt Clu XI. Ly. cly ty. xi. Oly 

1305 1310 
CTC ACC CAA CTT CCA CCA CTC TCA CAO 

vll S.r Clu vtl Fro Alt V4l Str Cln 
1320 I* 2 * 

ACA CTC CAC CCT TCT ACT TTA TCT TCA 
Arc t.u Cln cly S.r str x*u s.r ft 
1240 



CAC CCT ACA ACC AAA TCC ACC 
Nit Fro Arg Thr Lyt S.r Scr 
1330 I" 5 

CAA TCA CCC ACC CAC AAA CCT 
Clu S«r Alt Arg Hit Lyt Alt 
1345 «50 



9 IX S S Hi 5 £ $ IS Si S §E IK 
& si gsT s s IK SI SI IS £ IS S SI 

~~ iai tct ACT TCT CTC ACT TCA CTT CAT ACT TTT CAC ACT CCT 

S2 Si £ S & «« *•» & 5 Phe 6Xu m 

1335 



3SS6 



3606 



3654 



3702 



3750 



3796 



3846 



3614 



3942 



3990 



4036 



4066 



4134 



4182 



4230 
SffiSBgSSSSS.SSSffig.S 

IS s as s Si ss - - as K £ s 



146S 



OH 



4326 



4422 



44?0 



4S.8 



4S66 



4662 



rxe GTT CTT CCA CAT OCT CAT ACT TTA TTA CAT TTT CCC ACA CAA ACT 

8^3 22 £ S &** K; 0 Ph# AU Thr cx " 

.„ eKT CCA TTT TCT TCT TCA TCC ACC CTC ACT CCT CTC ACC CTC 

tS fS S£ Sg JE Mr Cy. Sor S.r jor^Uu *.r AL Uu jjrUu 

ex* CXC CCA TTT ATA CAC AAA CAT CTC CAA TTA ACA ATA ATC CCT CCA 4614 
1% cE ?~ m Ly. A.P V41 clu U« Aro n. Hot Pro Pre 
^jj5 1S20 15« 

err gag CAA AAT CAC AAT CCC AAT CAA ACA CAA TCA CAC CAC CCT AAA 
5S Sin Ota £ A.J ^ 01, A- clu Thr olu Sox 01. oil Pro Ly. 

CAA TCA AAT CAX AAC CAA CM AAA CAC OCA CM AM ACT ATT CM TCI 471: 
US Aon Cl« A.« Cl« CXtt I*. Clu AU Cl» Ly. Thr a. A.p S.r 
154S MSO 1858 

CAA AAC CAC CTA TTA CAT OAT TCA CAT CAT CAT CAT AIT CAA ATA CTA OSe 
tit S- aE U« U« A.p A.p S.r A.p ».p A.p A.? U- 01W n. Uw 
j$«0 IMS *» 75 

CAA CAA TOT ATT ATT TCT CCC ATC CCA ACA AAC TCA TCA CCT AAA CCC 
CW Clu Cyi XXi Uo S.r AM Met Pro Thr ly. »tr ».r At, ly. Cly 
uso l' w 18f0 

AAA AAC CCA CCC CAC ACT CCT TCA AAA TTA CCT CCA CCT CTC CCA ACC 
£2 52 S Cli tS AU Mr Ly. U« Pro Pro >ro WIU Arc 
* * ^595 1600 *ow5 

AAA CCA ACT CAC CTC CCT CTC TAC AAA CTT CTA CCA TCA CAA AAC ACC 

£. ??o ill ell Fro val Tyr Ly. Uu Uu Pro S.r Clu Aon Arc 
1610 161* lwg 

TTC CAA CCC CAA AAC CAT CTT ACT TTT ACA CCC CCC CAT CAT ATC CCA 
Cl5 Pro CU Ly. Hit V.l for Ph. Thr Pro Cly A.p A.p Mot Pro 
162S 1«C * M - 



4805 



48S4 



4902 



4950 



4996 



~w m CAA CCC ACA CCT ATA AAC TTT TCC ACA OCT ACA 

u< *" ,M % 

1640 1645 

- cxa TCC CCT CCA AAT 6 AC TTA CCT CCT 

IS Si iS S E S 5H S S ;~ m. « ». {i; «• 

- a* r sn s S5 s s ss s sj s e e e 

oly Civ Cly vaI^^v ciy «ir »*• j*g 0 * 16gs 

s s s s » ss s 3 s s e a s, s s sj 

&£SEEEE5E£EEEE£E 

1710 ' 

fli oXT XTT err CCA CAA TCC ATT AAT TCT CCT ATC CCC AAA CCC 
S£ S £J m £ AU «. cy. n. A.n ».r AU Hit Pro ly. Cly 

ft/^ «e ftfte eer TTC CCT OTC AAA AAC ATA ATO CAC CAC CTC CAC 
£ £ St £ S2 5 ty. Ly. U. *.t A.p Cln v ? l o «n 

CAA CCA TCT CCC TCC TCT TCT CCA CCC AAC AAA AAT CAC TTA CAT CCT 5«« 
fm S sir aS i«r 5.r J.r Alt Pro A.o ty. A.n Cln Uj Mp Cly 

17 *° 17 " 



5046 



5094 



5142 



5190 



salt 



5266 



5JB2 



5430 



mm - ... .* e m ccft act TCA CCA CTA AAA CCT ATA CCA CAA AAT ACT 

**• * S?o^ *"s V ' 1 ty ' "* m A ' ft 

CAA TAT AGC ACA CUT CTA ACA AAA AAT CCA CAC TCA AAA AAT AAT TTA 
ciu Tyr Arg Thr Ary V.l at, I*. Mo AU A.p Mr ty. A.A A.r. te» 
j79S 17*0 *'*» 

AAT OCT CAO ACA CTT TTC TCA CAC AAC AAA CAT TCA AA3 AAA CAC AAT M?» 

£ tit XrV v" Ph. ST A.p A.n ty. g M ly. ty. Gift A.^ 
ItOO AS0I 

ttc xxx AAT AAT TCC AAC CAC TTC AAT CAT AAC CTC CCA AAT AAT CAA 5126 
Lau Ly* Aan Mn Mr S? A.. Ph. A.n J^ty. U. Fro A.n *««. 

cat ACA CTC AOA CCA AC TTT OCT TTT CAT TCA CCT CAT CAT TAC ACC JS74 
S S Si Ej 3r s.r Ph. Alt Ph^A-p Mr Pro Hi. JM $ Tyr Thr 

CCT ATT CAA CCA ACT CCT TAC TCT TTT TCA CCA AAT CAT TCT TTC ACT S«22 
iV. ciC l\y Thr Pro Tyr MFM *« Aro A.n A.pS.r f « s.r 
X9S0 »•■«• 

TCT CTA CAT TTT CAT CAT CAT CAT CTT CAC CTT TCC ACC CAA AAC CCT S«TO 
I" Iml PM A.p A.p AC? MP V.l A.p f . Mr Mf Ol„ ty. Al. 
1865 1870 1575 
jlcA KKO CCA AAA CAA AAT AAC CAA TCA CAC CCT AAA CTT ACC 5*16 
6UUU £g Ly. ty. 5 01« Aan Ly. Clu SerOlu AX* I*. Val 

. flC rir XCA CAA CTA ACC TCC AAC CAA CAA TCA OCT AAT AAC ACA CAA 5766 
Ser Kit Thr Clu Leu Thr Ser Aan Cln **£ $ Scr AU Mft lya ™f 0 CAft 

CCT ATT CCA AAC CAC CCA ATA AAT CCA CCT CAC CCT AAA CCC ATA CTT 5614 
Ala Zlo Ala Lye Oin Pro lit Ain Arg Cly cln Pro Lya Pro XI* Leu 
X915 lM0 1925 

CAC AAA CAA TCC ACT TTT CCC CAC TCA TCC AAA CAC ATA CCA CAC ACA 5662 
Cln Lya Cln Ser Thr Pha Pro Cln Ser Ser Lya Asp Xlt Pro Aap Arg 
1$30 1935 1940 

CCC CCA CCA ACT CAT CAA AAC TTA CAC AAT TTT OCT ATT CAA AAT ACT 5910 

civ Ala Ala Thr Aap Clu Lye Leu Cln Aan Pha Ala Xla Clu Aan Thr 
1 1945 1950 1955 

CCA CTT TCC TTT TCT CAT AAT TCC TCT CTC ACT TCT CTC ACT CAC ATT 6966 
Pro Val Cya Pha 5ar Hia Aan 6ar Sar Lau 5ar 6ar Uu 6«r Aap Xla 

liaO . 1965 1970 1975 

CAC CAA CAA AAC AAC AAT AAA CAA AAT CAA CCT ATC AAA CAC ACT CAC 6006 
Aap Cln Clu Aan Aan A»n ly Clu Aan Clu Pro Xla ly Clu Thr Clu 
I960 1985 1990 

CCC CCT CAC TCA CAC CCA CAA CCA ACT AAA CCT CAA CCA TCA CCC TAT 6054 
Pro Pro Aap Sar Cln Cly Clu Pro Sar Lya Pro Cln Ala Ser Cly Tyr 
* 1995 2000 2005 

CCT CCT AAA TCA TTT CAT CTT CAA CAT ACC CCA CTT TCT TTC TCA ACA 6102 

Ala Pro Lya Sar Pha Hia Val Clu Aap Thr Pro Val Cya Pha Sar Arg 
2010 2015 2020 

AAC ACT TCT CTC ACT TCT CTT ACT ATT CAC TCT CAA CAT CAC CTC TTC 6150 
Aan sar sar Uu Sar Sar Leu Sar Xla Aap Sar Clu A.p Aap Leu Leu 
2025 2030 2035 

CAC CAA TCT ATA ACC TCC CCA ATC CCA AAA AAC AAA AAC CCT TCA ACA 6196 
cln clu cya Xla sar s«r Ala Mat Pro Lya Lya Lya Lya Pro Ser Arg 
3040 2045 2050 2055 



CTC AAC CCT CAT AAT CAA AAA CAT ACT CCC ACA AAT ATC CCT CCC ATA 6246 
Leu Lya Cly Aap Aan Clu Lya Hia Sar Pro Arg Aan hat Cly Cly Zle 
2060 2065 2070 



TTA CCT CAA CAT CTC ACA CTT CAT TTC AAA CAT A*A CAC ACA CCA CAT 6294 
Leu Cly Clu Aap Leu Thr Leu Aap Leu Lya Aap X a Cln Arg Pro Aap 
2075 2060 2055 

TCA CAA CAT CCT CTA TCC CCT CAT TCA CAA AAT TTT CAT TCC AAA CCT 6342 
Ser Clu Hia cly Lou Ser Pro Aap Ser Clu Aan Pha Aap Trp Lya Ala 
2090 2096 2100 

ATT CAC CAA CCT CCA AAT TCC ATA CTA ACT ACT TTA CAT CAA CCT CCT 6390 
Xla Cln Clu Cly Ala Aan Ser Xle val Ser Sar L«u Kit Cln Ala Ala 
2206 2110 2115 
— . -#~r ici CAA CCT TCC TCT CAT TCA CAT TCC ATC 

s s si s: s ss s si :« & «, < 

- 3 s gas 55 s is. 52 s: * s a" }2 

55 3 S JSSSSffiKS! 3 S 3 3KS E 

2200 2205 

SSSBgB5SB = 2}SSS£B 
SI S Si 5 SI a S IS £ S B S S SS S 

3 si &s ~ s * in," ss 2f - £ s s 
g s as s 55 = s a a si ss SI S 3 

2265 2270 

ssassasasssssssg, 

2280 « 85 ** 

»gj gCA CCT TCT ACA TCA CCA TCT ACA CAT TCC ACC CCT TCA ACA 
S $ S fro grut cr dy ..r Arg M, .«r Thr Pre grAr, 

„_ _ rr oa CCA TTA ACT AC* CCT ATA CAC TCT CCT «C CCA AAC 
S S SS rS S Xr ? Pro II. Cln S.r Uo 0 y Ar C A.n 

23X5 • 
M mv* tcp CCT CCT ACA AAT CCA ATA ACT CCT CCT AAC AAA ITA TCT 

£ S S S £ *«« f&»- ■« *• L ' u s-r 

2330 2235 " 

%% _m «££ ACA TCA TCC CCT ACT ACT CCT TCA ACT AAC TCC TCA 

SS EE JS tS IS s«>ro s.r Thr Alt S^Thr ty. S.r St 



64J£ 



6486 



6534 



6S62 



6630 



6676 



6726 



6774 



6622 



6670 



6916 



6966 



7014 



7062 



7110 



234S 
m _ „ ... A cA TCT CCA OCT ACA CAC ATC ACC CAA 

% S SJ & S 5w S «« •« «■ - 8?, 

^ XAX AC* CCT TTA TCC AAC AAT CCC ACT AC? ATT 
CAC AAC CTT ACC AAA CAA ACA R A*A All Mr S.r XU 

cln A.n Uu Thr Jyj 0 «*« ™ aM , ' 2 J«0 

. ... «... «-r ere Tee AAA CCA CTA AAT CAC ATC AAT AAT CCT 
S Ar, E S2 E £ £ |S U- A-" - MnA.n CXy 

... ftftX eee AAT AAA AAC CTA CAA CTT TCT ACA ATC TCT TCA ACT AAA 

£ Sy aK En 5. 5! V.» OX. U. Mr Ar* Mt Mr 8.r Thr Ly. 

2410 2411 2420 

^ CAA TCT CAT ACA TCA CAA ACA CCT CTA TTA CTA CCC 

s s s £ ,€r ciu xtg nv* x uu vtl 9 

2430 



3425 



7156 



7206 



7254 



7102 



73S0 



7398 



7S42 



act TTC ATC AAA CAA CCT CCA ACC CCA ACC TTA ACA ACA AAA 

SK S £ »5 S «■ ^ >** *« Thr * tt *** %U 

2440 2441 « « 

•we exe CAA TCT CCT TCA TTT CAA TCT CTT TCT CCA TCA TCT ACA CCA 7446 
2 SS SS 12 51 IS Ph. «1« »« U« ».r >ro Ser Ser Ar, Fro 
2460 246$ 2470 

CCT TCT CCC ACT ACC TCC CAC CCA CAA ACT CCA CTT TTA ACT CCT TCC 7494 
S2 III t£ Arg S.r Cln AU Cln Thr Pre v*l U« Str Pre s.r 
2475 2410 2«s 

err CCT CAT ATC TCT CTA TCC ACA CAT TCC TCT CTT CAC CCT CCT CCA 

Su S S ^ 2G ser Thr Hi. s.r Ser v.: cln Alt Cly Cly 
2490 249S 2100 

TCC CCA AAA CTC CCA CCT AAT CTC AST CCC ACT ATA CAC TAT AAT CAT 7690 
trj tej Lyi U u fro fro M» Q M" S.r Pro Thr »« 9 C1 " *•» *•» 

CCA ACA CCA CCA AAC CCC CAT CAT ATI CCA CCC TCT CAT TCT CAA ACT 78J8 

ES tu Si Sc Hi. A. F 1X8 AX. Ar.S.r 81. Mr Clu S.r 
2520 "2* 2530 « 535 

ccr TCT ACA CTT CCA ATC AAT ACS TCA CCA ACC TS6 AAA CCT CAC CAC 7686 
SSISISmU: A.» AT, f« Cly Thr Trp ty. Arc CXu HI. 

2S40 aM * 3110 

XBC AAA CAT TCA TCA TCC CTT CCT CCA CTA AOC ACT TCC ACA ACA ACT 77*4 
Mr tyi Mil S«r S.r Mr Uu rro Arg V4l Mr Thr Trp Ar, Arg Thr 
2SSS ' Mo 

CCA ACT TCA TCT TCA ATT CTT TCT CCT TCA TCA CAA TCC ACT CAA AAA 7782 
Sly sir sir S.r S.r lit Mu Mr AX. Mr 8.r clu Mr »mt cl« Ly. 
2170 2575 2SSQ 

CCA AAA ACT CAO CAT CAA AAA CAT CTC AAC TCT ATT TCA CCA ACC AAA 7830 
AU Lyi S«r Clu Mp Clu lyt Bit V4l A.n Ser He ser cly Thr tyt 
25$5 2590 2S9S 



WO 92/13103 t . PCT/ US 9:/00376 




in- CTA TCC CCA AAA CCA ACA TCC AC A AAA ATA 

is s 5: st £ as s m. ». «r~ «» *• <>• 

2600 2§0 * 

rre irt XAT ACT ACT TCT CAC ACC C7T TCC 
AAA •* *« CAA ITT TCT CCC ACA AJT ACT ^ ^ ^ ^ >f 

ty» Ci« Ain Clu ™ 0 * €r 2$25 2630 

is sj 2 s e s s a ~ - - £» = 

2635 

5S2S iil 55 JS.JS SS S} E SS S? S B 

^ «■« era X« CAC ACT CTT TCA 6AA AM CCA AAT CCA AAC ATT AAA 
g S S S ™ S8 S.r 61, ty. Jl. A.n Fro A« U. Ly. $ 

266* 



#*r«r »er ere CCT TTC CAA AAT CSC CTC ACC TCC TTT ATT CAC 

s an g ss sj s ciu u. » mn. «» 

... oee eer CAC CAA AAA OCA ACT CAC ATA AAA CCA CCA CAA AAT 

en 3 ss s $ «. «. «. 

%m _ rTr CXA TCX gas j^cT AAT CAA ACT CCT ATA CTC CAA CCT 
SSS5SSS5 A.n Pro II. v.! 61. Arg 

2745 f SO 2759 

kCC CCA TTC ACT TCT ACC ACC TCA AOC AAA CAC ACT TCA CCT AC? 666 
gS s-r J.r MrMr S.r f .r ty. Hi^s.r Mr fro S.r Cly 

ACT CTT 6CT 6CC A6A 6T8 ACT CCT TTT AAT TAC AAC CCA ACC CCT A66 



Thr val All AU Ar« vS £ V~ Ph. A.» Tyx A.n Pro Jr. Ar 9 

37SO 3US ">0 

.J* ACC ACC CCA CAT ACC ACT TCA CCT CCC CCA TCT CAC ATC CCA ACT 
Mr S.r AU A»p s« Thr f.r AU Ar 9 Pro s.r Cln Xi. Pro Thr 

1 2715 2800 

CCA CTC AAT AAC AAC ACA AAC AA5 CCA CAT TCC AAA ACT OAC ACC ACA 
S !2 A.n Mn A.n Thr ty. ty. Arg A.p 3er ty. Tftr A.p s.r TAr 
2BX0 

CAA TCC ACT CCA ACC CAA ACT CCT AAC CCC CAT TCT CCC TCT TAC CTT 

c£ IS £r cij tS Cin «« fro Ly. Ar« Hit S.rOly S.r Tyr l«y 
2J2$ 2830 2835 



787fl 



7926 



7974 



8022 



8070 



-8118 



2880 

T fA AAA CAT AAT CAC CCA AAA CAA AAT CTC CCT AAT CCC ACT CTT 8X66 
S 1% # S SS U8 ^ Cin A.n $ VAl Oly A.n Cly S.rV.1 



8214 



6262 



8310 



63S6 



6406 



84S4 



8S02 



6530 
CTG ACA TCT CTT TAAAACACA6 CAACAATCAA ACTAACAAAA TTCTATCTTA $602 

Val Thr Sar Va,l 
2640 

ATTACAACTG CTAtATAOAC ATTTICTTtC AAATGAAACT TTAAAACACf 6AAAAATTTT 8662 

CTAAATACCT TTOATTCTTO TTAOACGCTT TTTCtTCTOC AAGCCATATT TOATAOTATA 8722 

CTTTCTCTTC ACTCCTCTTA TttTCCGACC CACTCTTCAT CG7TAGCAAA AAATACAAAG 6762 

CCAACTATCT TTCTACACTA TCTTTTACAT GTATTTAAAO IAOCATCCCA TCCCAACTTC 6842 

CTTAATTATT 0CTTG7C7AA AAtAATCAAC ACTACAGATA CCAAATAT6A TATATTGCTO 6902 

TTATCAATCA TTTCTAGATT ATAAACTCAC TAAACTTACA TCAGGGGAAA ATTGCTATTT 6962 

ATGCAAAAAA AAAATGTTTT TCTCCT1GTC AGTCCATCTA ACATCATAAT TAATCATCTC 9022 

CCTGTOAAAT TCACACTAAT ATCCTTCCCC ATCAACAAGT TTACCCAGCC TGCTtTCCTT 9082 

ACTCCATCAA TGAAACTGAT OGTTCAATTT CACAACTAAT CATTAACAGT TATCTGGTCA 9142 

CATCATCTCC ATACAGAIAC CTACA6TGTA ATAATTTACA CTA7XTTCTC CTCCAAACAA 9202 

AACAAAAATC TCTCTAACTC TAAAACA7TG AA7GAAACTA TTTTACCTGA ACTA6ATTTT 9262 

ATC7CAAAGT AGCTACAATT TTTGCTATCC TCTAATITCT TGTATATTCT GCTATTTGAG 9222 

CTCAGATGGC TCCTCTTTAT TAATCACACA TGAATTCTCT CTCAACAGAA ACTAAATGAA 9362 

CATTTCACAA TAAATTATTC CT6TAT0TAA ACTGTTACTC AAATTCGTAT TTCITTGAAG 9442 

CGTTTCTTTC ACATTTCTAT TAATTAATTG TTTAAAATGC CTCTTTTAAA AGCTTATATA 9802 

AATTTTTTCT TCAOCTTCTA TCCATTAACA GTAAAATTCC TCTTACTGTA ATAAAAACAT 9862 

TCAACAAGAC TCTTCCCACT TAACCATTCC ATCCGTTCGC ACTT 9606 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2843 amino acids 
(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

Met Ala Ala Ala Ser Tyr Asp Gln Leu Leu Lys Gln Val Glu Ala Leu 

1 5 10 15 

Lys Met Glu Asn Ser Asn Leu Arg Gln Glu Leu Glu Asp Asn Ser Asn 
20 25 30 

His Leu Thr Lys Leu Glu Thr Glu Ala Ser Asn Met Lys Glu Val Leu 
35 40 45 

Lys Gln Leu Gln Gly Ser Ile Glu Asp Glu Ala Met Ala Ser Ser Gly 
50 55 60 
u. ix. *.p i- u- * u ty ' 61u *s MB Uu A,p Mr 'S 

„„ Ph. Pro Oly v.i ty. t.« Ar, S.r ty. H.t Mr Mu Ar; S.r tyr 

«y s.r Ax, Jl. Oly g *" Cly eta g{ *" "° 

v.l Pro M.t Cly f.r Ph. Pro Arg Arg Oly Ph. v.l M» Cly s.r Arg 



Cl« S.c Thr Cly Tyr u. «u el. U. Ci. ty. CI. Arg Mr tec U. 
130 * 3S 

AX. A.p U. A.P ty. ci. clu ty. 61. ty. A.p trp tyr xyr hU 



us 



do Uw Cln M« &h thr ty. Arg II. A.p Mr t.u Pro t«u thr CXu 

A.n Ph. Mr t.u cm thr A.p Uu m Arg Arc Cln Uu Clu Tyr Cl« 
1.0 lM *" w 

xl. Arg eio II. Arg v.l Al. M«t Clu Clu Cln U. Cly thr Cy. Cln 
X9t 300 «.» 

A.f M.t Clu ly. AT9 Al4 Cln Arg Arg II. Al. Arg II. Olo Cln II. 

Clu ty. A.p XI. too Arc II. Ar 9 Cln Uu Uu Olr. s.r cln Al. thr 
32$ 230 «*. 

CI. xl. Clu Arg Mr Mr Cln A.n ty. Hi. Olw thr Oly Mr Hi. A.p 

Al* Clu Arc Cln A«n Clu Cly Cln Cly v.l Cly Clu XI. A.n M.t 

Thr Mr Cly A.n Oly Cln Cly Mr Thr thr Arc M«t A.p Hi. Clu thr 

XI. Mr v»l L.u Mr Mr Mr Mr thr Hi. S.r Al. Pro Are Arg Uu 
290 295 200 

Thr S.r Hi. Uu Cly thr ty. v.l Clu H«t v.l Tyr $.r Uu Uu f.r 

305 310 

Met i.u Cly Thr Hi. A.? ty. A. P Mp £« Mr <*« thr Uu M AX. 

M.t S.r *«r Mr Cln Aep f.r Cy. XX. Mr M.t Arg cxn Mr CXy Cy. 
340 * 41 "° 

Uu Pro t«u t«u IX. Cln Uu uu Hi. CXy A.n A.p ty. A.p s.r v.l 
jjj 3*0 365 

Uu Uu Oly Aan Mr Arg CXy Mr ty. CXu AX. Arg Al. Arg Al* Mr 

XI. Al. Im Hi. A.n II. II. Hi. Mr Cln Pro A.p A.p ty. Arg Cly 
38$ 390 3 « ,vw 
Ar, Arg Clu llm Arg Vtl Uu Hi* Uu Uu Clu «•> lie **9 M« Tyr 

Cy. Clu Thr Cy. Try Clu trp Cln Cl» KU HI. Clu Pre Cly M.t A.p 

Cln Aop tya Pro Met Pro Ala Pro Vol Clu Hii Cln lit Cy. Pro 

Ala Vol Cyf Vol Uu Met tyo Uu Sor Phe Asp Clu Clu Hit Arg Hio 
4S0 *M 440 

AU K«t Aon Olu Uu Oly Cly Uu Cln Ait Xlo Ala Clu Uu Uu Cln 
465 470 475 oow 

val Aop Cyo Clu Mot Tyr Cly Uu Thr Aan Aop Hio Tyr Sor Xlo Thr 

4IS 4t0 495 

Ug Arg Arg Tyr Alt Cly Mot Alt Uu Thr Aon Uu Thr Pho Cly Aop 

Vol Alo Aon tyo Alt Thr Uu Cya oor not tya Cly Cyo Mot Arg Alo 
$H S20 125 

Lou Vol Alo Cln Uu Lyt sor Clu sor Clu Aop Uu Cln cln Vol Xlo 
SJO 535 S40 

Alo Sor Vol Uu Arg Aon Uu sor Trp Arg Alo Aop Val Aon Sor tyo 
$45 550 $$5 540 

Lyo Thr Lou Arg Clu Vol Cly Oor Vol tya Alo Uu Mot Clu Cya Alo 

Lou Clu vol Lyo tya Clu Sor Thr Uu tyo Sor Vol Uu Sor Alt Uu 
580 585 590 

Trp Aon Uu Sor Alo Hit Cyo Thr Clu Aon tyo Alo Aop Zlo Cya Alo 
095 400 505 

Vol Aop Cly Alo Uu All Pho Uu Vol Cly Thr tou Thr Tyr Arg Sor 
610 415 «20 

Cln Thr Aon Thr Uu Alo Xlo Xlo Clu Sor Cly Cly Cly Xlo Uu Arg 
62S 630 4JS 640 

Aon Vol Sor Oar Uu Xlo Alo Thr Aon Clu Aop Hio Arg Cln Xlo Uu 
645 650 655 

Arg Clu Aan Aan Cyo Uu Cln Thr tou Uu Cln Mio Uu Lyo sor Kit 
v 660 665 510 

Sor Uu Thr Xlo Vol Sor Aon Ala Cyo Cly Thr Uu Trp Atn Uu Sor 
675 603 60S 

Alo Arg Aon Pro tyo Aop Clr. Clu Alt lou Trp Atp Hot Cly Ala val 
690 60S 700 

Sor Met Leu Lyo Asn Uu Xlo Hie Sor Lyo Kit Lyo Mot Xlo Ala Not 

70S 710 715 720 

Cly Ser Alo Alo Alo Uu Arg Aon Uu Not Ala Aon Arg Pro Ala Lya 

725 720 725 
Ty* ty. A.p Al. A.n II. H.t Ser ft. Cly S.r «« Uu £0 Ser Uu 

Hi. V.l Arg ty. «« ty. Al. U- Clu Al. Clu t.U MP Al. Cln Hi. 
7S5 760 



nk« no Ac a lit A»? tmu Ser Pro tyi AU $cr 
imu s«r Clu Thr Pb» *«P A«n ?gc 

770 775 
Hi. Arg ser ty. Cln Arg Hi. ty. Cln S.r te« Tyr cly A., Tyr Jg 



, M 790 '« 



pa. A.p Thr A.n Arg Ml. Mp A.p A.h Ar, s.r A.p MR Ph. Mn Thr 

•OS ** v 
Cly A.n H.t gr V.l t.u S.r Fro Tyr t.u Mn thr Thr v.l Uu Fro 

Ser Ser Ser Ser S.r Ar, Cly s.r Uu A.p s.r S.r Arg S.r Clu ty. 

A.p Arg S.r U» Clu Arg Clu Arg Cl r «• Cly Uu Cly A.« Tyr Hi. 

ISO >** 
Pro Al. Thr CI. As« Uo Cly Thr Ser S.r ty. Arg Cly Uu Cl» U» 



865 



S«r Thr Thr Al. Al. Cln II. AU ty. v.l M.t Clu Clu v.i ser Al. 

U« Hi. Thr S.r Cln Clu A.p Arg •« «.r Cly S.r Thr Thr Clu Uu 
gOO 90S " w 

Hi. Cy. V.I Thr fc.p Clu Arg Aer. Al. Uu Arg Ar 9 S.r S.r Al. Al. 
915 MO '«* 

Hi. Thr Hi. s.r Aon Thr Tyr A.n Ph. Thr ty. S.r Ciu A.n S.r A.a 
910 935 »«0 

Arg Thr Cy. s.r M.t Pro Tyr Al. ty. U» Clu Tyr ty. Arg S.r S.r 
94; 9S0 555 

A.n A.p ser Uu Mn Ser Vel S.r Ser A.n A.p Cly Tyr Cly ty. Arg 

Cly Cln net ty. Pro Ser He Clw Ser Tyr Ser Clu A.p A.p Clu Ser 
* 900 985 "O 

ly. Phe cy. ser Tyr Cly Cln Tyr Fro Al. A.p Uu Al. Hi. ty. lie 
995 10CO iwua 

Hi. ser Al. A.n Hi. Met A.p A.p A.n A.p Cly Clu tea A.p Thr Pro 
1010 l02 '' 

XI. A.n Tyr S«r teu ty. Tyr s.r Mp Clu Cln Leu A«n S.r Cly Arg 
x0 25 . 1030 1035 10.0 

cln Ser Pro Ser Cln A.n Clu Arg Trp Ale Arg Pre ty« Hi. lie lie 
1045 10X0 1055 

clu A.p Clu II. ty. Cln S.r Clu Cln Arg Cln Ser Arg Mr Cln Ser 
106C io. » ,0TO 
Thr Th» Tyr rro V»l Tyr tnr Clu s« Thr M? Mp lyj Mi. L«u ty. 
1075 IOBQ juo* 

Phe Cln^Pro Hii Pht Cly Cln^Gin Clu Cyt vtl Mt^Pro Tyr Arg Ser 
Arg Cly AU Aen Cly Ser Clu Thr am Ar 9 Vtl Cly Ser Am Hit Cly 

110% 1120 

Ut Asa Cin Atn v«l Ser Cin $tr Leu Cyt Cln Clu Atp Atp Tyr Clu 
U25 1130 1135 

Atp Atp tyt Pre Thr Atn Tyr Str Clu Arg Tyr str Clu Clu Clu cin 

r mo H« 5 H* 0 

Hit Clu Clu Clu Clu Arg Pro Thr Atn Tyr Str lit Lyt Tyr Atn Clu 
1135 11*0 1165 

Clu tyt Arg Nit Vel Atp Cln Pro lit Atp Tyr Str Leu Lyt Tyr Alt 

ii7o ins iiso 

Thr Atp lie Pro Ser str Cln tyt Cln Ser Pht Str Phe Str tyt Ser 
lies H»0 1195 1200 

Str Str Cly Cln Str Str tyt Thr Clu Nit Hot tor str Str ser Clu 
1205 1210 1215 

Atn Thr Str Thr Pro Str Str Atn Alt tyt Arg Cln kmn Gin Lou Hit 

1220 1225 1230 

Pro Str Str Alt Cln S«r Arg Str Cly Cln Pro Cln Lyt Alt Alt Thr 
1235 1240 1245 

Cyt Lyt Vtl Ser Str Zlt Atn Oln Clu Thr lit Cln Thr Tyr Cyt Vtl 
1250 1255 1260 

Clu Atp Thr Pro tit Cyt Pht Str Arg Cyt Str Str Lou Str Str Lou 

1265 1270 1275 1280 

Str Str Alt Olu Atp Clu lit Cly Cyt Atn Cln Thr Thr Cln Clu Alt 
12S5 1290 1295 

Atp Str Alt Atn Thr Lou Cln Zlt Alt Clu lit Lyt Cly Lyt lit Cly 
1300 1305 1310 

Thr Arc Sor Ala Olu Atp Pre Vtl Str Clu Vtl Pre Alt Vtl Str Cln 
1215 1320 1325 

Hit Pro Arg Thr Lyt Str Scr Arg Ltu Cln cly str sor Leu str Str 
1320 1335 1340 

Clu Str Alt Arg Nit Lyt Alt Vtl Clu Pht Pro Str Cly Alt Lyt Str 
134S 1350 1355 136C 

Pre Str tyt Str Cly Alt Cln Tnr Pro Lyt Str Pro Pre Clu Hit Tyr 
1365 1370 1375 

Vel Cln Clu Thr Pro Lou Not Pht Str Arg Cyt Thr Str Vtl Str Ser 
1380 1365 139C 

Leu Atp Ser Phe Clu Str Arg Str He Alt Ser Ser Vtl Cln Ser Clu 
1395 1400 1405 



pre Cy. S.r CXy Met V.X S.r CXy XI. XI. S.r M P t.« ,ro 

14X0 

A.p S.r Pro CXy CXn Thrh.t Pro Pro s.r Ar^Mr ty..Thr Pro j„ 
1425 WW 

P te Pro Pro CXn JJM. 0U a- gj^ »" J-*. 
XX . Thr hUCXu ty. Arc CXu MrWy. Pro ty. Cln JU AU Vol 
M« XX* JU V.X Ara V.X ClnV.X to- Pro M P AX. Mp Thr Uu 

t.u hi. Ph. AX. Thr GXu Mr Thr Pro A.p CXy Ph. S.r Cy. ».r S.r 
X4S0 i49S 

S. r fu S.r "° Wn ^ "to 

xsos *» iW 

OIu t.u Ar, XI. H«t Pro Pro V.X CXn M. Ma A.p A.A CXy A^CXu 



1S2S 



Thr CXu 5.r ex. CXn Pro ty. Clu Mr M* CX» Mn CXn eg ty. CXu 

1140 XMi *«* 

AU CXu ty. Thr XX. MP S.r Gl» ty. A.p U» U« A.p A.p S.r A.p 
1555 I>o0 

A.p A.p A.p XX. CXu IX. Mu Clu OX. Cy. XX. XX. S.r AX. Mt Pro 

XS70 15,5 1999 

Thr ty. S.r S.r Ar, ty. CXy ty. ty. Pro AX. CXn TAr AX. Sor ty^ 
1585 

1» pro Pro Pro v.X $ AX. Ar, ty. Pro U:Cl* Uu Pro V.l Tyr^y. 

f« Pro s.r CXn Mo Ar, t.« CXn Pro CXn ty. Hi. v.XMr Ph. 
1620 1825 * OJW 

Thr Pro CXy A.p A.p M*t Pro Ar, v.X Tyr Cy. v.X CXu CXy Thr Pro 

1635 1« 4 - **" 

IX. A.n Ph. S.r Thr AX. Thr S.r t.u S.r A.p Uj Thr IX. CXu Mr 

X6S0 X* 5 * * ow 

Pro^ro A.» CX. Uu U^AX. CXy CXu CXy V.X $ Ar, CXy CXy AX. CX^ 

s.r CXy CXu Ph. CX^ty. Ar, A.p thr XX.PTO Thr CXu CXy Ar^Mr 

,hr A.p CXu AX. CXn CXy CXy ty. ISr $ S.r S.r V.X Thr XX^Pr. CX« 

Le U A.p KBph'n ty. AXi CXu Clu^Cly A.p IX. t.u AX.^CXu Cy. IX. 

A.n s.r AX. M.t Pro ty. CXy ty. s.r Hi. ty. Pro Ph. Ar« V4l ty. 
X130 1755 * 
tyt XI* »«P «" V*l Gin Cln hU S#r M« »*r Str »*r XI. Pre 
}745 1750 »"■■ *»ow 

Mi. tyt A.n Cln uw M P Cly Ly. ty. ty; ty. >ro Thr S.r too V.l 
1755 17/0 . _ *''» 

tyt Pro lie Pro Cln Atn thr Olu Tyr Arg Thr Arg Vtl Arg tyt Atn 
7 X780 i7M 1 

Alt Atp Str tyt Atn Atn Uu Atn All Clu Arg Vtl Pht Str Atp Atn 
1795 1800 1805 

tyt Mp Str Lyt tyt Cln Atn Uu tyt Atn Asn Str Lyt Atp Pht Atn 
1810 1*1* 1830 

Atp tyt ttu fro Atn A.n Clu Atp Arg Vtl Arg Cly Str Pht Alt Pht 
1825 M30 I* 40 

Atp tor Pro Hit Nit Tyr Thr Pro lit Clu Cly Thr Pro Tyr Cyt Pht 
1845 1**0 ISIS 

Str Arg Am Atp Str Uu Str Str Uu Atp Pht Atp Atp Atp Atp Vtl 
18(0 1S6S 1*70 

Atp Uu Str Arg Clu tyt Alt Clu Uu Arg tyt Alt tyt Clu Atn tyt 
F 1875 V 1S80 1SS5 

Clu Str Olu Alt tyt Vtl Thr Str Hit Thr Clu ttu Thr Str Atn Cln 
IttO 1695 1900 

Cln Str Alt Atn tyt Thr Cln Alt lit Alt Lyt Cln Pro Xlt Atn Arg 
1905 1*10 1*1S 1920 

Cly Cln Pro tyt Pre lit Uu Cln Lyt Cln Str Thr Pht Pro Cln Str 
7 IMS 1930 1935 

Str tyt Atp lit Pro Atp Arg Cly Alt Alt Thr Atp clu tyt Uu Cln 
1940 1945 1950 

Atn Pht Alt lit Clu Atn Thr Pro Vtl Cyt Pht Str Kit Atn Str Str 
1955 19*0 1965 

Uu Str Str Uu Str Asp lit Atp Gin Clu Atn Atn Atn tyt Clu Atn 
1970 197S 1960 

Clu Pro Zlt tyt Clu Thr Clu Pro Pro Atp Str Cln Cly Clu Pro Str 
1985 1990 199S . 2000 

tyt Pro Cln Alt Str Cly Tyr Alt Pro tyt Str Pht Hit Vtl Clu Atp 
2005 2010 2015 

Thr Pro Vtl Cyt Pht Str Arg Atn Str Str U; Str Str Uu Str lit 
2020 3025 20J0 • 

Atp Str Clu Atp Atp Uu Uu Cln Clu Cyt lit Str Ser Alt Met Pre 
2035 2040 2045 

tyt tyt tyt tyt Pro Str Arg Uu tyt Cly Atp Atn Clu tyt Bit Str 
2050 2055 2060 

Pre Arg Atn Mtt Cly Cly Xlt Uu Cly Clu Atp Uu Thr ttu Atp ttu 
2065 2070 2075 2080 
ei« .ro fro A.P S.r CXu Hi. CXy U- Pro *•§ «« 
I,y. Afp XX. 2W 2090 

6U MB m } .g«p tr. x*- * u A,n JS. 1 * vtl 

»r s« i-^ «• *u m cy u« cm AX. 

,. r ,„ A.p Mr IUU. Mr Uu Ly. & CXy XI. Mr U. 



2X30 



6ly ,.r >ro Ph. Hi. Wj Thr Pro A-P CXn CUCW ty. Pro Th^ 



axis 2»° 



s.r a.« ty. «iy xx. uu ty. ProCXy ex. ty. ..r J**. 

6 x„ Thr ty. ty; xx. cx» Mr cx« m; ty. cxy xx. xy. exjrexy ty. 
ly . V.X Tyr s ty. ..r f - XX. ThrWy Xy. v.X Ar, S«*.n S.r CXu 
XX. sirGXy CX« K.* ty. CXyro U„ OXn AX. A.» o K.t Pro Mr IX. 
S« Arc CXy Ar, Thr Hi. XX. Pro CX^VoX at, A.n S.r M^ 



223S 



S.r S.r TAT S.r P~ v.x s.r ty. ty. gjm Pro to- ty. TtePr. 



324S 



XX. s.r ty. a.r Pro Ser CX« CXy cxn Thr AX. Thr Thr Mr*. Arc 
2240 

CXy AX. ty. $ Pro S« V.X ty. MrCX. U. Mr Pro V.X $ AX. Arc CXn 
, M «« o CXB XX. ciy cxy Mr^r ty. AX. P" s^Ar, S«r CXy S«r 
ArgA.p S.r Thr Pro MrAr, Pro XX. CXn CUPro t.u S.r Ar, iw 
cm s.r Pro 

232S 

s.r Pro Pro Mnty. f« Mr CXn to^Pro Ar, Thr S.r SorPro sor 



XX. CXn S.r Pro CXy^Aro x.n Mr XX. ..^Pro CXy Xr, A.n CXy $ XX. 



,h« AX. ».r $ Thr ty. S.r S.r CXy Mr CXy ty. M.t S.r $ T>r Thr S.r 
P« Cly_Ar« CXn Mt S.r CXn «n A.n Uu Thr ly^CXr. Tnr CXy Mu 



2370 



S. r t y. A.n AX. S.r S.r XX. Pro Ar, S.r Ciu Scr XX. Mr ty. C* 
2385 23 *° 

U.u A.n CX« Hot A.n Mn CXy X.n CXy AHA., ty. ty. v.X CM Mo 
ser Arg net ser Str Thr Lye Ser Ser Cly Ser clu Ser hep Arg Ser 
2420 2428 300 

Clu Arg Pro Val Leu VaX Arg CXn Ser Thr Phe XXe Lye Clu AXa Pro 
2435 2440 **** 

ser Pro Thr Leu Arg Arg Lyi Uu Clu Clu Sec Ma 5er Phe Clu Ser 
2450 24SS 2<SC 

Leu ser Pro Ser Ser Arg Pro AXa Ser Pro Thr nrg Ser Cln Ala Cln 
2465 3470 247S 2400 

Thr Pro Val Leu Ser Pro Ser Leu Pro Asp Mat Ser Leu Ser Thr Me 
2485 3490 2495 

Ser Ser Val Oln Alt CXy Gly Trp Arg Lye Leu Pro Pro Aan Leu Ser 
2S00 2S0S 2510 

Pro Thr lie CXu Tyr Aan Aap CXy Arg Pro AXe Lya Arg Ria Aap ZXe 
25X5 2520 2525 

Ale Arg Ser His Ser OXu Ser Pro Ser Arg Leu Pro XXe Aan Arg Ser 
2520 2535 2540 

CXy Thr Trp Lya Arg OXu Hii Ser Lya ait Ser Ser Ser Leu Pro Arg 
2545 2550 2555 2560 

Val Smr Thr Trp Arg Arg Thr CXy Ser Ser Sex Ser XXe Leu Ser AXe 
2566 2670 2575 

Ser Ser CXu Ser Ser CXu Lya AXa Lya Ser Clu Aap Clu Lya Hie Val 
25S0 25S5 2590 

Aan Ser lie ser Cly Thr Lya Cln Ser Lya Clu Aan Cln val Ser Ala 
2S95 2600 2605 

Lya Cly Thr Trp Arg Lya lie Lya Clu Aan CXu Phe Ser Pro Thr Aen 
26X0 26X5 2620 

Ser Thr ser CXn Thr Val Ser ser Cly Ala Thr Aan Cly Ala Clu Ser 
2625 2630 2635 2640 

Lya Thr Leu ZXe Tyr CXn Net AXa Pro AXa VaX Ser Lya Thr CXu Aap 
2645 2650 . 2655 

VAX Trp val Arg XXa Clu Aap eye Pro lie Aan Asa Pro Arg Ser Cly 
2660 2665 2670 

Arg Ser Pro Thr CXy Aan Thr Pro Pro VaX He aap Ser VaX Ser CXc 
2675 2680 2685 

Lya Ala Aan Pre Aan XXe Lya Aap Ser Lye Aep Aan Cln Ala Lya Cln 
2690 2695 2700 

Aan VaX CXy Aan Cly Ser Val Pro Met Arg Thr Val Cly Leu Clu Aan 
2705 27X0 27X5 2720 

Arg Leu Thr 5er Phe XXe CXn Val Aap AXa Pro Aap CXn Lya CXy Thr 
2725 2730 2735 

CXu XXe Lya Pro Cly CXn Aan Aan Pro VaX Pro VaX Ser OXu Thr Aan 
2740 2746 2760 



WO 92/13103 PCT/US92/00376

cx« S.r m tt. V.1 CI- *r, rnr^o ,M S « S.r s« 

Ly . „u I"*, rro «„ cxr m va XX. *x. a « THr ,ro - 

2770 277 * 



... «. «. ;» «. * «' - }}}»"' - ~ ~ 555. 

«. - ■«•» - "° »- ■ A " m a* " 

Mp .„ „. », ». •« «» ... «» - « L " 

„, Hla «, »r »r ». »} «. ... ... 

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: OPKtSX)

(ix) FEATURE:
(A) NAME/KEY:
(B) LOCATION:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

-~ ~» rr* ere TXT COC CCA CIA CCA XCA OCC CCC CCW CCC 

SSSSS K S S xu H m XX. ,ro eg CXy 

B = BSSK5: = SS2 = = S = SS 

<rm XXC Xec XCC TTC XTC OCT CTT OCT CTC XTC COX 

S SSSSS ZL £| S. »• «■ 6 * v « l 8 » 

50 " 
^ w «c TXC CK CTC TTC CCT TXT CCX CCC TCT CTC CTC TCC 240 



96 
144 
192 
206 



336 



364 



432 



489 



528 



— ™ r -r% ttt ece TAC CCA CCC TAC ATC TCA ATT AAA CCT ATA 
£ B Z xS SI} JS S5 S5 S S «g U. -r 11. ly. *J. U. 

— ~~ iik £AA CAT CAT ACC CAC TCC CTC ACCTAC TCC CTA 

K E !2 i5 Si Si £ 25 js «■ *» u " ™ ft Trp V41 

- — ~«*» ATT CCT CAA TTC TTC TCT CAT ATC TTC CTC 

5 25 S J2 i5 5 S Si S f*. 5.r ». s u. w»* t.« 

jjj 120 *«» 

TCA TCC TTC CCC TTC TAC TAC ATC CTC AM TCT CCC TIC CTC TTC TCC 
I" TrJ Ph. Pro Ph. Tyr Tyr M« Uu ty. Cy. cly Ph. Uu U« Trp 
no u» 1,0 

*ee ILTO CCC CCC ACC CCT TCT AAT CCC CCT CM CTC CTC TAC AAC CCC 
gj H?t xti ?ro Er Fro Mr A.n Ciy AU CU U« Lou Tyr ty. Arc 

ATC ATC CCT CCT TTC TTC CTC AAC CAC CAC TCC CAC ATC CAC ACT CTC 
H. IU At, Fro Fh. Ph. Uu Ly. HI. Ciu tor Cin «.t A.p ;r V.l 
If g 270 *'» 

CTC AAC CAC CTT AAA CAC AAC TCC AAA CAC ACT CCA CAT CCC ATC ACT 576 
V.X ly. A.p l.u ly. A.p ly. S.r ly. Cin Thr AU A.p AU 11. Thr 
180 !•* 1,0 

AAA CAA CCC AAC AAA CCT ACC CTC AAT TTA CTC CCT CAA CAA AAC AAC 62* 
ly. clu AU ly. ly. AU Thr V.1 ami Uv Uu Cly Clu Clu ly. ly. 
195 200 205 

ACC ACC TAAACCACAC TAAACCAOAC TCCATCCAAA CTTCCTCCCC TCTCTCTACC 660 
S«r Thr 
3X0 

TTCCTACTCC ACCTTCATCT TATATTAOCG ACTCTGOXAT AATTATTTTA ATAATCTTCC 740 

CTTGGAAACA TTTTTC ACAT ATTAAA6ATT CCAATOTCTT CTAACTTTCT TTCCTTACTT 800 

TTACTCTCTA TAT AT AT ACC CACCACTTTA AACTTAATCC ACTCCCCACT CTCCACCTTT 860 

TTCCAAAATC TATTTTGCCT CTOCCTACCA AAASATCTAT CTTC?TATCC TCCACCAAAT *20 

ATAAACTTAA AATAAAATTA TATACCCCAC ACCCTCTOTA CTTTACTCCC CTCTCCCTCC 100 

ACSSATTTTC TCTCTACTTA CATTTACCRT AATCTTTATC CTTCTACTTC CTATAATCTA 1040 

CAATTTTATA TAATTCN^FA ATOTTTTTAA TCTATTT5TC CACATCTACA TATGGAAATC 1100 

TTACTCTCTC ACTACAKCAT CCATCATCCT CATOGOCACC CACCACCCCA ACCTTCTATC 1100 

TCTCATTTAT AACTTCTCTA CACTAACACC ACCTCCCAAA ACCTCCACCA ACCATTCTCC 1220 

TCCTCTCCTC TACTAAATAA TACTTTACCA AATAC6TCAT TAATATOCAA CTCAACAAAC 1260 

TCACAAATCA AATCCAATGO ACATTCCCCT CCTTCTTICC CTACTATATC CCATATCAA7 1340 

ACCACCATAC CTTTATAAAC CA6TTACTTA CTTAGTTACT CACTCTACTC ATAAATCCCC 1400 

AAATTTACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAC ACACACACAO MZ 
«r *A«eTCAAT TCCCTCAAAA kCTACTAATA CTCTCTTATC TGCTAtAAAC 1*20 
«TC««T «,««=»* — ~ «— ~ 

~««Tei irccttxtct t cmWTM k»bti» >•« 
« BICIT c K wanKK «cctct«T «aJW« 

— - '™ ~^ 
„ /Wllxexcx ^Acccrrec erxetnwa awcagaotc actwtactc mo 

cxaaac«aa K-cocxtcrc tccahca* xcccaahtcs htatacacac 

„AACTACAC TAAAACAACT CTATAACTAA ACTAACAACA TTAAATATCC ACCeAOTACA 

©TATTTTTTA ABCCAAATAA AGA5CATTA6 CTCACCTT6A CHTAACAATC AGCTAA8ATC 2180 
ATKACAATGT CTCATCATCT HAAKAATATT AAACATATCA ATACTAACTC ACACTAtCAC 
NMCZAATAXA AIAtCCATCA CACCATTTAT TTTCCCCAQO AAAACA6T60 TOATTACCCC 
CATTTTATXA AACtTAAAAC TUCIAOAAA CCAAACAAAA TTCTTCTTCC CACAAAAtCA 

ACTTTTACAT TAAAAAAATT «AACTAWCT ACCACTATTT AAATCCTTTT CCCATAAATA 2*20 

AAASTACA0? tTTCITCCTC GCAOAATOAA AATCACCAAC KTCTAOCATA TACACTATAT 2U0 

AATCAGATTC ACACCATATA OAATATA«A TCACACAACA lOACOAOOTA CAAAACTTAC 2840 

tATTCCTCAT AATCACTTAC A6CCXAAAAH TACKTMTAAA ATACIAXAXT AAATTCTCAA 2800 

TCCAATTTTT TTTTCTTCCC TtCACACCAA AATTTAACTT AACTCTTCCT CCCA57CTAA «*0 

CTCTAAATOT tAACAOCACC A6AACTTAA6 AAIXCAOCAC TTCTOttGCA TCAttTCCCA 2720 

XATCAAATAC TCCCHCCCT ACACTTTCAA AAAOAATIG ACCCWTCCC TGCCIACAAA 27.0 

ACAACCCTTT ATTTCAATG? CAATA6TCTT TCAAAOCtAT Cf ACTTACAS AATTCCTACC 2840 

AAACAO«TA AATTCTTCAA CAAAOAAWC CICCAOCAOT TATTCCCTTA CCTCAACCCT 2800 

TCAATCATZT G6ATCAACAA C7CCTACTCT C0GCAACAT CCTCTACTCA CAGCICAACA 2880 

AAATCACCAC ACCCTTCXCA CTCTTATCAC CtATCCtCAA CATCTCAtAC ACICAAtCCA 3020 
AATAAATAOA tCTAAAtAAA ATTOAGWTCT CAIttAAAAA AAACCATOTC CCCAAIt^CA 
AAATCACCTC ATC«CtCCT TTAAACAGCA ACTGCACCCA CTAGCACAGG CCATTCACCT 
ANCCTATATA lACATCteiC TCACTCCCCC TC 

(2) INFORMATION FOR SEQ ID NO:4:



2240 
2200 
23(0 



3080 
3140 
3172 
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Ala Val Ala Ala Pro Val Tyr Pro Ala Leu Gly Thr Ala Pro Gly Gly
1 5 10 15

Glu Thr Val Pro Ala Met Ser Ala Ala Met Arg Glu Arg Phe Asp Arg
20 25 30

Phe Leu His Glu Lys Asn Cys Met Thr Asp Leu Leu Ala Lys Leu Glu
35 40 45

Ala Lys Thr Gly Val Asn Arg Ser Phe Ile Ala Leu Gly Val Ile Gly
50 55 60

Leu Val Ala Leu Tyr Leu Val Phe Gly Tyr Gly Ala Ser Leu Leu Cys
65 70 75 80

Asn Leu Ile Gly Phe Gly Tyr Pro Ala Tyr Ile Ser Ile Lys Ala Ile
85 90 95

Glu Ser Pro Asn Lys Glu Asp Asp Thr Gln Trp Leu Thr Tyr Trp Val
100 105 110

Val Tyr Gly Val Phe Ser Ile Ala Glu Phe Phe Ser Asp Ile Phe Leu
115 120 125

Ser Trp Phe Pro Phe Tyr Tyr Met Leu Lys Cys Gly Phe Leu Leu Trp
130 135 140

Cys Met Ala Pro Ser Pro Ser Asn Gly Ala Glu Leu Leu Tyr Lys Arg
145 150 155 160

Ile Ile Arg Pro Phe Phe Leu Lys His Glu Ser Gln Met Asp Ser Val
165 170 175

Val Lys Asp Leu Lys Asp Lys Ser Lys Glu Thr Ala Asp Ala Ile Thr
180 185 190

Lys Glu Ala Lys Lys Ala Thr Val Asn Leu Leu Gly Glu Glu Lys Lys
195 200 205

Ser Thr
210

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 434 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: TBI

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

t ser CXy Arg AX* Pro Arg Mis Pre 
X0 1* 



40 4S 

CXy Ser A«p Leu CXy Bie Trp VeX Thr Thr 
ss ™ 

Ser Arg Aen tou Kit Trp CXy CXu Lys Mr 
75 BO 

Thr Thr Ser Thr Pro Tyr CXu CXy Pro Thr 
90 « 

CXy CXy CXy CXy Ser V*; CXn CXy CXn ser 
101 XXO 

Phe AX* CXy Phe CXy XXo CXy tw AX* Ser 

120 xas 

Leu AXe His Pro Cyf XXe v*X lou Arg Arg 
X25 1*0 

Hie AXe CXn Hie Tyr Hi* Leu Thr Pro Phe 
XS5 1*0 

Tyr Ser Phe Asn lye Thr CXn CXy Pro Arg 
1 X70 175 

CXy Ser Thr Phe XXe VeX CXn CXy VeX Thr 
XS5 190 

2Xe Ser CXu Phe Thr Pro Leu Pro Arg CXu 
200 205 

Pro Lye CXn XXe CXy CXu Hie Leu Leu Leu 
2X5 220 

VeX AXe Met Pro Phe Tyr Ser AXe Ser Leu 

CXu XXe XXe Arg Mp Aen Thr CXy XXe Leu 
250 255 

XXe CXy Arg Val XXe CXy Met CXy V*X Pro 
265 270 



VeX 
X 


AXe 


Pro 


Vel vel 
5 


VeX 


Pro 


Ale 


AXe 


Met Hit 
20 


Pro 


Tvr 


Arg 


CXy 
25 


CXy 


AXe 


Arg 


AXe 


Arg 
SO 


ser 


Fh« ser 


Thr 


Pro 
65 


pro 


Asp 


IXe Pro 


civ 
uiy 

70 


Pro 


Pro 


Tyr 


CXy VeX 
85 




CXu 


CXu 


Pro 


Phe 

100 


Ser 


Ser 


ser 


CXu 


CXn 
XXI 


Leu 


Aen 


Arg 


Lem 


phe 
ISO 


Thr 


Clu 


Aen 


V*X 


win 

X4S 


Cve 


CXn 


vel 


Aen 


Tyr 
150 


Thr 


VeX 


XXe 


Aen 


XXe 

145 


Met 


AXe 


Leu 


Trp 


Lye 
XB0 


CXy 


Met 


Leu 


CXy 


AXe 

Xf5 


CXu 


CXy 


XXe 


val 


Leu 

2X0 


His 


ty« 


Trp 


ser 


X-ye 
225 


Ser 


Leu 


Thr 


Tyr 


VeX 
230 


Xlo 


CXu 


Thr 


V*l 


CXn 
245 


Ser 


CXu 


eye 


VeX 


Lye 


CXu 


CXy 
Kit 


far 


Lyt 

27S 


xrg Ltu 


T^n tre Lau Lau Sar Ltu lit Pht Pro Thr Vtl 
280 385 


Ltu 


Hit 

290 


Ciy 


Vtl Ltu 


u(« fur He tla sar Ser Val lit Gin Lyt Phe 

:5s aeo- 


Vtl 

30S 


Ltu 


LtU 


Xlt Uu 


Lyt Xrc Lyt Thr Tyr Atn ttr »*■ *#*u mi v*u 
3iC *1* 320 


str 


Thr 


5«r 


Pro Vti 
32S 


Gin Str Mtt Ltu Xtp Ala Tyr rut rro exu **«u 
330 


Xlt 


Xlt 


Atn 


Pht Xlt 
340 


Xlt Str Ltu Cyt Str Xtp v*l Jit Ltu xyr rro 
345 3$0 


Ltu 


Glu 


Thr 
3SS 


V4l Ltu 


Hit Jug Ltu Hit Xlt Gin Gly Thr Arfl Thr Ait 
3*0 36$ 


Xlt 


xtp 

370 


Xtn 


Thr Xtp 


Uu Gly Tyr Glu Val fv fro Xlt Atn Thr Gin 
J7* MO 


Tyr 
38S 


Glu 


Gly 


Ktt Krg 


Xtp Cyt Zlt Atn Thr lit at? win e*u viv way 
390 395 400 


vtl 


Nit 


Gly 


Pht Tyr 
40S 


Lyt Gly Pht Gly Xlt Val lit Zlt Gin Tyr Thr 
410 415 


Ltm 


Hit 


Xlt 


Xlt v*i 
420 


Ltu Gin Zlt Thr Lyt lit Zlt Tyr str Thr Ltu 
42S 430 


L*u 


Gin 









(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 185 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: Yt-39(TS2)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Glu Leu Arg Arg Phe Asp Arg Phe Leu His Glu Lys Asn Cys Met Thr
1 5 10 15

Asp Leu Leu Ala Lys Leu Glu Ala Lys Thr Gly Val Asn Arg Ser Phe
20 25 30

Ile Ala Leu Gly Val Ile Gly Leu Val Ala Leu Tyr Leu Val Phe Gly
35 40 45
Tyr Clr Mc «« U. Cy. urn U» XI. Cly Ph. Cly tyr Fro Al. 
SO " 

tyr IU Ut II. ty. *!• «" * n J. - " ty- 61,1 A,P MP 55* 

t \ 70 

CXn Trp U. fM gr Trp V.1 V.l Tyr Cly v.l Ph. S.r XI. Al. Clu 

,h. Ph. S.r A.p XI. Ph. tm S«r tro Ph. Pro Ph. tyr tyr XI. t.« 

X, y . cy. Sly Ph. Uu Uu Trp Cy. Mt XI. Pro P«r Pro *«r A« Cly 

AU ei« Uu Uu ty, ty. Arc XI. XI. Arg Pro Ph. Ph. Uu ty. Ml. 
130 *J» 

Clu ».r cm Mt A*p «.r v.l v*l ty. ».p Uu ty. A.p ty. AL ty. 

j45 ISO 

Civ Thr Al. A.p Al. II. Thr ty. Clu Alt ty. ty. Alt Thr V.l A.n 

170 

Uu Uu Oly Clu Clu X*ys Lys Sar Thr 
180 

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2842 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: APC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

Met Ala Ala Ala sar Tyr Aap Cln Uu Uu X#« Cln vai Clu Ala Uu 
1 S 10 15 

Lys Mat Clu Aan Sar Aan Uu Arg Cln Clu Uu Olu Af? Aan Sar Atn 
20 25 30 

Kia Uu Thr Lya Uu Cly Thr Clu Ala Sar Atn Mat Lya Olu Val Uu 
39 40 45 

Lya Cln Uu Cln Cly Sar Ila Clu Aap Clu Ala Met Ala Sar Sar Cly 
T SO 55 SO 

Cln Ila Asp Uu Uu Clu Arg Uu Lyt Clu Uu Ran Uu Atp Sar Sar 
CS 70 K *° 




Mn p M Pre Cly V.J ty. Uu *1 *« ty> M«t C.r U« **9 «jr Tyr 

Cly s« Hr, Ci« Cly »« vtl S.r tar *r« «.r QXy OXu Cy. $.r Fro 

100 . 

vol Pro Not Gly Sor Ph. Pro Arg Arg Cly Pho v*l Mj Cly Sot Arg 

H$ 120 *** 

Glu $or Thr Gly Tyr Lou Clu Clu Loo 61w lyi Clu Arg Sor Uu Lou 
130 149 

Lou Alo Aop Uu Aop Lyo Glu Glu Lyi Clu Ly; Mp Trp Tyr Tyr Alo 
145 ISO I'' 

Cln Lou Gin Aon Uu Thr Lyo Arg Xlo Aop Sor Uu uu Thr Ciu Aon 

)(| 170 170 

Pho Sor Uu cm Thr Aop Mot Thr Arg Arg Clu Uu Clu Tyr Glu Alo 
180 1,0 

Arg Gin Xlo Arg Vol Alo Hot Clu Clu Cln Lou Cly Thr Cyo Clu Aop 
191 200 205 

Mot Clu Lyi Arg Alo Cln Arg Arg Xlo AU Arg Xlo Gin Gin Xlo Clu 
210 ' 3 IS 220 

Lyi Aop 21o Uu Arg llo Arg Cln Uu Uu Cln Smz Gin Alo Thr Olu 
226 220 23S 240 

AU Clu Arg sor Sor Cln Am Lyi Hii Clu Thr Cly Sor Hio Aip Alt 
245 2S0 255 

Clu Arg Cln Aon Clu Cly Cln Gly Vol Gly Clu Xlo Aon Mot AU Thr 
240 265 270 

Sor Cly Aon Cly Cln Cly Sor Thr Thr Arg Mot Aip Hio Clu Thr Alo 
275 280 215 

Oor vol Uu Sor Sor sor sor Thr Bli Sor Alo Pro Arg Arg Uu Thr 
290 29S 300 

Sor Hii Uu Cly Thr Lyi Vtl Clu Mot Vol Tyr Sor Uu Lou sor Mot 
305 * 310 315 320 

Lou Cly Thr Hio Aop Lyi Aip Aip Mot Sor Arg Thr Lou Uu Alo Mot 
325 330 335 

sor sor sor cln nop Sor Cyi Xlo Sor Mot Arg Cln sor cly Cyo uu 

340 345 3S0 

Pro Lou Uu Xlo Cln Lou Leu Hio Gly Aon Aip Lyi Aop Sot v*l L.u 
35$ 360 365 

Lou Cly Aon sor Arg Oly Sor Lyo Clu Alo Arg Alo Arg Alo Sor hio 
370 37$ 300 

Alo Luu Hii Aon Xlo Xlo Hii Sor Cln Pro Aop Aop Lyo Arg Cly Arg 
305 300 39$ 400 

nrg Clu llo Arg Vol Lou Hio Lou Lou Clu Cln Xlo krg hio Tyr Cyi 
405 410 415 
425 

fro Vtl Clu Hit Cln lit Cyi fro Alt 
440 "5 

Sac the AiP ClU Clu Mlf Arg Mi. AlA 
460 

Cln A 1a 21a Alt Clu Uu Liu Cln VaI 
47$ 4*0 

thr Aaa Atp KtA Tyr Str Xlc Thr Uv 
490 49* 

Uu TAT Aaa Uu Thr 9hA Cly AAp VaI 
SOS 

Str MAt LyA Cly CyA MAt Arg A1a Uu 
W 0 

Str Clu Atp Uu Cln Gift VaI 21a A1a 
S40 

Trp Arg AlA AAp vaI AAn SAr Lyt LyA 

sss *«o 

VaI tyt AlA Lau Hmt Cly CyA A1a Lau 
570 975 

Lau Lyt 5«r VaI Lau Str A1a Law Trp 
ses 590 

Clu AAA Lyi AlA AAP XlA Cyi AlA Vtl 
600 

VaI Cly Thr Uu Thr Tyr Arg SAr Cln 
620 

Clu SAr Cly Cly Cly 21a Lau Arg Atn 
625 640 

Aaa Clu Atp Kit Arg Cln 21a Uu Arg 
650 

Lau Uu Cln Kit Uu Lyt Sat Hit SAr 
€65 «70 

Cyi Cly Thr Lau Trp Atn Uu SAr Alt 
680 MS 

Alt Uu Trp Ai? Kit Cly AlA VAl Scr 

7CC 

SAr LyA MiA Lyi MAt XXa AIa Met Cly 

Lau MAt AIa Aaa Arg Pro AIa Lyi Tyr 

730 735 



Clu 


Thr 


Cyi 


trp 
420 


Clu 


Trp 


Gin 


Atp 


ty« 


Ain 
435 


Pro 


Mtt 


Pro 


Alt 


VAX 


cyA 
450 


VAl 


LAU 


MAt 


Lyt 


Uu 
4SS 


KAt 

465 


AAA 


Clu 


Uu 


cly 


cly 
470 


Uu 


AAp 


CyA 


Clu 


MAt 


Tyr 

485 


cly 


Uu 


Arg 


Arg 


Tyr 


Alt 

500 


cly 


KAt 


Alt 


AlA 


AAn 


Lyt 
SIS 


Alt 


Thr 


Uu 


cyi 


VaI 


AlA 

530 


Cln 


Uu 


LyA 


SAr 


Clu 
535 


««r 

545 


VaI 


Uu 


Arg 


AAA 


Uu 
550 


ser 


Thr 


Uu 


Arg 


Clu 


Vtl 

565 


Oly 


Str 


Olu 


VAl 


Lyi 


Lyt 
seo 


Clu 


Str 


The 


AAA 


UU 


Str 

S9S 


Alt 


Hit 


cyA 


Tht 


AAP 


cly 

610 


Alt 


Uu 


Alt 


PhA 


Uu 
615 


Thr 
625 


AAn 


Thr 


Uu 


Alt 


XlA 

630 


XlA 


VAl 


SAr 


SAr 


Lau 


XlA 

645 


AlA 


Thr 


Clu 


AAA 


AAn 


cyt 

660 


Uu 


Gin 


Thr 


UU 


Thr 


lit 
675 


VAl 


SAr 


Am 


Alt 


Arg 


AAn 
690 


Pr© 


Lyt 


Atp 


Gin 


Clu 
695 


Met 
70S 


Leu 


Lyt 


Atn 


Uu 


XlA 

710 


Kit 


Ser 


AlA 


Alt 


Alt 


Uu 
725 


Arg 


Atn 


LyA 


AAp 


Alt 


Atn 

740 


ZlA 


KAt 


Ser 
v.l Ar 9 ty. «n ty. *U f» 61- U. *.p Al. Ht. f« 

7SS ,w 

,.r Olu Thr W A.p »• *»P Uu *" ljr " Al4 ,€f ,l- 
770 71 s 

Ar* S.r Ly. «. Hi. ty. «r s.r u. Tyr Oly A.p Tyr V.1 tjj 

7§5 "SO 

Mp Thr Mr, Ac? Hi. A.p A.p Mn Ar, S.r Mp Ma Fh. A.n Tjr Cly 

K.n M«t Thr v«l f« «.r >re Tyr t.u A.n Thr Thr V.l Uu Pro ««r 
§20 •«» 

S.r s.r s.r S.r Ar 9 «y Mr Uu A.p s.r S.r Ar 9 S«r «« ty. A.p 
935 040 •« 

Arg S.r Uu Clu Xrg Clu Xrg Cly XL Cly Uu Cly Atn Tyr Hit Fro 
850 160 

AU Thr Clu A.n no Cly Thr S.r S.r ty. Xrg Cly Uu Clu lit S.r 
.65 170 

Thr Thr Al* Al. Cin XI. XI. ly. V.l Mot Clu Cly V.l S.r XU XU 

.as sw 

Hi. Thr s.r cin clu x.p xrg Ser S.r Cly S.r Thr Thr CU Uu Hi. 
900 MS 910 

ey. v.l Thr Atp Clu Xrg Atn XI. Uu Xrg xrg S.r S.r XI. XI. Mi. 
915 920 925 

Thr Hi. s.r A.n Thr Tyr Am Ph. Thr tyt s.r Clu Xtn s.r A.n Xrg 
930 935 * 40 

Thr Cy. s«r K.t Pre Tyr XI. ty. Uu Clu Tyr tyt Xrg S.r S.r X.n 
945 9S0 95$ 

a.p s.r Uu X.n S.r V.1 S.r s.r S.r Atp Cly Tyr Cly ty. Arg Cly 
9(5 970 97$ 

Cin Met Lyi Pro S.r lit Clu S.r Tyr s«r Clu X.p Atp Clu S.r ty. 
9$0 MS 990 

Ph. cy. S.r Tyr Cly Cin Tyr Pro XU X.p Uu Al* Mi. ty. XU Hi. 
995 1000 100S 

Smz XI* Xin Hi. M.t X.p Atp A.n Atp Cly Clu Uu Asp Thr Pro XU 
1010 101S 1020 

Aan Tyr Str Uu ty. Tyr S.r A.p Clu Cin Uu A.n S.r Oly Xrg Cin 
X02S 1°*° ms i040 

S.r Pro S.r Cin Atn Clu Arg Trp Al. Arg Pro tyt Hi. XI. XU Clu 
104$ 1010 1095 

X.P Clu XI. ty. Cin s«r Clu cin Xrg Oln s.r Xrg Xtn Cin S.r Thr 
1060 104S 2070 

Thr Tyr Pro v.l Tyr Thr clu s«r Thr X.p Xip tyt Hi. Uu ty. Ph. 

1075 1080 108S 
... ,»„ am Gin Clu cy« v*l Mr Fro Trr Arg Mr Arc 
Gin Fro Hi. Fho «y «» *i n s c * u T 1100 
1090 iOTS 

«, «. ».» «» «« »" «• a*"* "» 

r.u **> ... s «. ~ *■ !!;,•" >,r T " »» " 

*v* ser Clu Clu Cl« Cln Hi* 
A.p ty. Fro Thr A.« Tyr Mr Clu Arc Tyr Mr c- «u jw 

1140 **** 
cu ou ciu oi. at, Pro Thr M-ryr Mr II. ty. Jyr $ A.. 01- Clu 

1135 1100 
ty . u, Hi. v.l A. P 01. Fro 11. A.p Tyr ..r Mm Tyr Al. Thr 

1170 " 7 * 
MP 11. Fro ,.r ..r 01.*. CI. Mr FN Mr $ Ph. Mr if J- 



1185 



ciy cm Mr Mr*. Thr «u Hi. jgfrr Mr «.r Mr oua.. 
Thr .or Thr Fro Mr Mr A.. Al. ty. $ Aro Cln A.n 01. gMU. »ro 
5 „ ,.r Almoin Mr Ar, S.r GljrCl. Pro 01. ty. Ai. $ Al. Th, Cy. 



ly . v.1 s« s« XI. A" «> 01. thr II. 01. Thr Tyr Cy. v.l Clu 

1250 1255 
A^Thr Fro XI. Cy. Ph. Mr Arg Cy. «.r Mr-- s.r s.r U. S^ 

s.r Al. Clu A.p Clu.Il. Ciy Cy. A.» 01»Thr Thr Cln Clu AMM, 



1305 



ser Al. A.. Thr t.u 01. XI. Al. Clu XI. ty. clu Ly. UoOly Thr 
1300 ww * 



ins 

,ro Arg Thr ty. «« Mr Aro u. 01. ciy «.r MrU. s.r Mr Clu 



ar 9 Mr AU Clu A.p Fro V.l MrMu V.l Fro Al- V.l M, Cln Ox. 

Xro L.u Cln Ciy Mr Mr I 
13)0 W4 ° 

s.r Al. Aro IU Al* v *l Olu Fh. ..r Mr Ciy Al. ty. Sor Pr^ 
134S UM 

s.r ty. S.r ciy AU $ Cin Thr Fro ty. M^Pro Fro clu Hi. Tyr $ v.l 

01. 01. Thr Prot.u H.t Fh. Mr Mg*. Thr Mr V.l Mr Mr fu 
Asp s.r PM «u Mr Ar, Mr jMAl. Mr ..r vol CloMr Clu Fro 



1395 



Cy. SorCly «t v.l Mr gjU. "« *• »« 0 *"> l «" Pw *" 
,.r Pro Cly Cl« Thr wt Pro t,o *•* **« Jor^y. Thr rro *ro PrO Q 



J42S 



Pro Fro Cln Thr M. CU Thr ty. **« Clw v*l fro i y< Atn ty. Al. 

1445 1450 _ 

Pro Thr Alt Clu Ly. Arg Clu S«r Cly Pro Ly. Alt Alt Vol A.n 
1460 1<« 1470 

Ale Alt vel Cln Arg Vel Cln Vol Ltu Pro Atp Alt Atp Thr Leu Leu 
147$ 1480 1485 

Hit Pfto Ait Thr clu Str Thr Pro Atp Cly Pho IJJ Cyi sor $«r Ser 
1490 149S 1500 

Lou Ser Alt Lou S«r Lou Aop Clu Pro Pho Xlo Cln Ly. Aop Vel Clu 
1505 1510 1515 1520 

Lou Arg Xlo Mot Pro Pro Vol Cln Clu Atn Atp AiA Cly Atn Clu Thr 
XS25 1530 1535 

Clu Sor Clu Cln Pro Lyt Clu Cor Atn Clu A.n Cln Clu Ly; Clu Alo 
1540 1545 1550 

Clu Lyt Thr Xlo Atp tor Clu Lyt Atp Lou Lou Aop Atp Str Aop Aop 
1 1555 1560 1565 

aop »tp Xlo Glu Xlt Ltu Clu Clu Cyt Xlo Xlo Sor Alt M«t Pro Thr 
r 1570 1575 1580 

Lyt Sor Sor Arg Lyt Alt Lyt Lyt Pro Alt Cln Thr Alt Sor Lyt Lou 
1585 15« 1*" 1$0 ° 

Pro Pro Pro Vol Alt Arg Lyt Pro Sor Cln Lou Pro Vol Tyr Lyt Lou 
1605 1510 1615 

Leu Pro Sor Cln Atn Arg Lou cln Pro Cln Lyt Hit Vol Str Pho Thr 
1620 1525 1630 

Pro Cly Atp Atp Mtt Pro Arg Vol Tyr Cyt Vol Clu Cly Thr Pro Xlo 
1635 1540 1545 

Aon Pho sor Thr Alt Thr sor Ltu Sor Aop Lou Thr xlo clu sor pro 
1650 1555 1560 

Pro Aon clu Lou Alt Alt Cly Clu Cly Vol Arg Cly Cly Alt Cln Sor 
1555 1670 1671 1680 

Cly Clu Pho Clr Lyt Arg A.p Thr Xlt Pro Tnr Clu Cly Arg ser Thr 
1685 1550 169S 

ASP Clu Alt Cln Cly Cly Lyt Thr Sor Sor Vel Thr Xlt Pro Clu Leu 
1700 1705 '1710 

Atp Atp Atn Lyt Alt Olu Clu Cly Atp Xlt Ltu Alt Clu Cyt Xlo Atn 

. 1715 1720 1725 

Sor Alt Met Pro Lyt Cly Lyt Sor Hit Lyt Pro Pho Arg v.l Lyt Lyt 
1730 1735 1740 

lie Met Atp Oln Vel Oln Cln Alt Ser Ai« Ser s#r Ser AU Pro A.n 
1745 1750 1755 1760 




t y. x.n ex. u. x.p $ cxy Ly. tt- ft™ tnr s.r pro JUty. 
, r e ,X. Pro CI JL Thr clu xyr ^Thr xr, v.: xr 9 LyoMn XX. 



ii4 Clu Ax* V4I >h« Scr Aip iyt 
" P 5 " SS$ Xift WOO MO* 

. n« i«i Lftu Lyi X«ft At* s#r Lyi A«p Fh« Xtn At? 

A«p ft Lyt ty§ Oln Ain u«V * 1M0 
1610 m> 

Ly. tou FTO X.a X.* CX^X-p Xrc V.1 -f «• ~ $0 



182* 



S.r Pre HI. Hi. Tyrm Pro XX. CX« cjy^hr Pro Tyr Cy. Pjo $ S.r 
xr, X.n Afp WW 3« Scr »» X.P $ Ph. X. P X.p A.p MpVxX X.p 



1860 



V» S.r xr, cXu Ly. XX. cx« lju xr, Ly. XX. Ly. CjjMn Ly. «. 
XS« i»Bg 



V.X Thr S.r HI. Thr CXu L.u Thr S.r X.n CXn CXn 



S.r CXu XX. Ly. v.X xnr ... ™ 0 



1090 



S.r XX. X.n ty. Thr CXn XX. XX. XX. Ly. CXh Pro XX. X.n Xr, g fc 
1905 1,10 

CXn Pro Ly. Pro XULoo «X« ty. «X« s.rthr Ph. Pro CXn ^ 

tf . A.p IX. Pro X.P Xr 9 CXy XX. XX. Thr X.p OX„ Ly. L«CXn X.n 
1940 

Ph. XX. XUCXo X.« Thr Pro V.X*. Ph. S.r Hi. X.« $ S.r Ser »« 

S ., ser U. *« X.p XX. MP CXn CXu X.« X.n X.n Ly. Clu X.» CX« 

1970 l97 » 
Pro s XX. Ly. CX. Thr cj^Pro Pro X. P ..r juai, CX- Pro sor fa 

p!! S cx» XX. S.r cXy s Tyr XX. Pro Ly. Wh. Hi. v.X CX„ x.p»r 



Fro 



v.X Cy. Ph. -r Xr, X.n S.r Mr U- S.r S.r L.« WXe X.p 



2030 



s.r clu X.p X.p Lou Lou OXn CU Cy. XX. S.r ..r XU Hot Pro Ly. 
3035 2040 

Ly. Ly.>y. Pro s.r Xrp g. «f CXy x.p x.« CX^Ly. H.. s.r Pro 



20SO 



m X.n M.t CXy CXy IUV» CXy Clu A.p LwThr L.u X.p L.u Ly. o 



206S 



X.p XX. CX. »r, P«X.p S.r CXu Hi. CXy Uu S.r Pro *. F 
A.» Ph. MP trp^y. xl. II. CXn CX^oXy XX. mo Mr XX.v.1 Mr 

s„ Lou Hl. $ CX. XI. XX. XU ey. Uu S.r A^CXn AX. Mr 

S.r MP Mr X.p Mr XL £ Mr ^ Ly. »« ClyU. Mr U. Cly 

,.t Pro Ph. Hi. t.« Thr Pre Mp Cln CXu clu Ly. fro Ph. Thr M^ 

214$ 21S0 **" 

A.n Ly. CXy Pro Ar 0j XX. Low Ly. P« CX^CXv Ly. S.r Thr go «« 

Thr Ly. Ly. XX^CXu Mr CX« S.r Ly. «y XX. Ly. CXy CXyLy. Ly. 

v*X Tyr Ly. s.r U» XX. Thr CXy Ly. v.X Arc Mr MB Mr Clo XX. 
2195 2200 *«» 

s.r CXy CXn M.t Ly. 0I» Pro Low CXn XX. A.n «« Ho Mr IX. Mr 
22X0 

CXy Aro Thr *.* XX.^Hi. XX. Pro CXy V.^Aro Mo s.r Mr M^ 



2225 

S.r Thr s.r Pro V.X Mr Ly. Ly. CXy Pro Pro Lo« Ly. Thr »ro AX. 
2248 22SC 

Mr Ly. s«r Pro «.r CXu Cly CXn Thr XX. Thr Thr Mr Jrotoj CXy 
1 22J0 2X»* 2370 

AX. Ly. Pro Mr v.X Ly. Mr CXu L.u Ser Pro V.X XX. Ar? CX« Thr 
2275 2280 22SS 

Mr CXn XX. CXy CXy Mr * u tn ,-r 5j| e * ar Cly **' *** 

A.p s.r Thr Pro Mr Me Pro AX. CX« CX» Pre Lou Ser Ar« Pro XX. 

2310 2J1S *j*u 

Cln Ser Pro CXy Arg Aen Ser Xle Ser Pro Cly Arc Aon Cly Uo Mr 
2325 23J0 2JJ5 

Pro Pro ma Lye Lou ser Cln Leu Pro Are Thr Ser fox Pro Mr Thr 

2340 2345 2J50 

Ale Ser Thr Ly. Ser Ser Cly Ser Cly Lyi Htx ser Tyr Thr ser Pro 
2355 2340 2365 

civ Are Cln net ser Cln Cln Aen Leu Thr Lyi Cln Thr Cly Leu Ser 
2370 «7S 2,80 

Lye Aen Ale Ser ser lie Pro Aro Ser Ciu Ser Ale Ser Ly. Cly Leu 
23 ft 5 2390 2313 2400 

Aen Cln Met Aen Aen Cly Aen Cly Ale AM Ly. Ly. Vol Clu Leu Ser 
240$ 2410 2419 

Aro hot Ser Ser Tht Ly. Ser Ser Closer Clu ser A.p £| 0 *" clu 
Axo Fro v.l U. v.l xr, Cln Mr Thr Ph. II. ty. Jg IU. Pro S.r 

243S 244C 
Pro Thr tou xr, Xr, ty. Uu Cl« Clu ,.r XL CI- S.r tou 



24S0 «W 

M r Pro Sor .or Ar, Pro AL S.r fro Tnr jro S.r Cln »U Cln m 
3465 2470 

Pro v.l Uu ,.r Jro $ 5or U. Pro x.p J.t«.r Uu S.r Thr HLSor 

s .r v.l Cln XI. Cly Cly trp hr 9 ty. Uw Pro Pro x.n to^S.r Pro 

2500 * 3W * 
thr XI. «» Tyr A.n *-P cly Xro Pro XL Ly. Ar, HI. A.p XI. Al. 

2S1S * 8W 
Ar, ,.r Hi. S«r Clu sor Pro S.r Ar, U. Pro XL x.n Ar, S.r Cly 

2$30 25,5 
Thr Trp ty. Arg 01. Hi. «.r ly. Hi. sor s.r S.r to- Pro Ar, VU 
2S4S 1850 

S.r Thr Trp Ar, h^Thr cly S.r Sor .or..r XI. L.« Sor AL Sor 

Sor Clu «.r ..r Clu ty. XU ty. Jjf Clu x.p Clu ly. Hi. Vol A.n 

2SB0 * 888 w,w 

sor XI. $« cly Thr ty. cm sor ly. Cl» x.n cm vol so* XI. ty. 

259S 2600 
Cly Th^Trp Xr, ty. XI. ty^clu x.n Clu Ph. S.rMO Thr x.n s.r 

,hr ..r Cln Thr Vol Sor s.r Cly XI. Thr Mn Cly XI. Clu Sor ty^ 
2625 2630 **** 

Thr too XL »yr Cln Mot XI. Pro XI. V.l *.r ty. Thr Clu x.^v.l 
2646 #6»w 

Trp v.1 Xro XI. CL X.P Cy. Pro XI. A. n A.n Pro Xr, Sor Cly Xr, 

S«r Pro Thr Cly X.n Thr Pro Pro V.l II. X.p S.r v.l S.r Clu ty. 
j$75 2660 "« 

Al. X.n Pro X.n XI. ty. x.p S.r ty. X.p X.n Cln Xl. ty. Cln X.n 

2690 7700 
v.l Cly X.n Cly S«r ^ Pro tot Xro Thr v.l^Cly fu Clu X.n Xr^ 



270S 



U« X.n S.T Ph. Xlj $ ein V.l A.p XI. Pro^X.p Cln ty. Cly Thr $ Clu 
XI. ty. Pro Cly Cln x.n x.n Pro v.l Pro vri s.r clu Thr x.n Clu 

27 4Q 2745 *#»v 

V 

2755 



st s«r IU Ml Clu xrg Thr Pr^Pha Str s.r s*r £^S« Set Lyt 
Hl« s.r «.r Pro *.r Cly Thr V*X Wi AL Aro v*l Tttf fro Ph. A.n 
S770 371$ 2 80 

Tyr $ A.n Pro S.r Pro Aryty. «« *»• »* S " 

Pro s« cin tl. Pro Thr Pro Vol A.r. Thr X,y« ty. Ar, A. P 

2905 2**0 «e*» 

Mr ty. Thr A.o S«r Thr Ciu Sor f t Cly Thr Cin S.r Pre ty. Ar 9 
2320 

His ser Cly Ser Tyr Leu Vel Thr Ser Vel 
aels «4o 

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vii) IMMEDIATE SOURCE:
(B) CLONE: r4l2(yeast)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Leu Thr Gly Ala Lys Gly Leu Gln Leu Arg Ala Leu Arg Arg Ile Ala
1 5 10 15

Arg Ile Glu Gln Gly Gly Thr Ala Ile Ser Pro Thr Ser Pro Leu
20 25 30

(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 29 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(vii) IMMEDIATE SOURCE:
(B) CLONE: ©J(sAChR)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

Leu Tyr Trp Arg Ile Tyr Lys Glu Thr Glu Lys Arg Thr Lys Glu Leu
1 5 10 15
xl . ». r cly Thr CXu AU 61- 6l« 
xx« cly t*» «*« fcU Mr as 

iS S&SB." U 

(0| TOPOlOCtl 
(ii) KOWCVtX TTfts p«PtW« 

< vl > •SPSS? «-«- 

(Til) ikkioixti sower ■ 
2 xy! « - « xu o» ex, xr, ,„ ,„ «. «. g- - 

1 1 

_ clu X gn CXi» S« X*u Thr AX* M«t 
JU* cXy Ar« exu CXu Ain j$ 

(2) INFORMATION FOR SEQ ID NO:11:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:
CT&TCAACAC TCTOACTTT ^TICTAGTT TATCCXTSTT 

(2) INFORMATION FOR SEQ ID NO:12:

(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
TTTACAATTT CATGTTAATA TATTOTCTTC TTTTTAACAG 
(2) INFORMATION FOR SEQ ID NO:13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CTACATTTTA AAAAGGTGTT TTAAAATAAT TTTTTAACCT 

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
AAOCAATTGT TCTATAAAAA CTTCTTTCTA TTTTATTTAC 
(2) INFORMATION FOR SEQ ID NO:15:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
CTAACTTTTC TTCATATAGT AAACATTCCC TTGTCTACTC 
(i) SEQUENCE CHARACTERISTICS:
J SpJotbhws. 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

nnnw nnknn hhhctccctt tttttaaaa* *^»AC 

(2) INFORMATION FOR SEQ ID NO:17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

GTAACTAACT TCGCA6TACA XCTTATTTGA AACTT7AATA 

(2) INFORMATION FOR SEQ ID NO:18:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
ATACAACATA TtOMACnt ITTATTATTT OTOCTTTTAC 
(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



40 



40 



40 




(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:
GTAACTTACT TCTTTCTAAC TGATAAAACA CYGAACACCT 40 
(2) INFORMATION FOR SEQ ID NO:20:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:
AATAAAAACA TAACTAATTA CGTTTCTTCT TTTATTTTAO 40 
(2) INFORMATION FOR SEQ ID NO:21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

CT7ACTAAAT TSCCTTTTTT CTTTGTGGCT AtAAAAATAO <© 

(2) INFORMATION FOR SEQ ID NO:22:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

XCCATTTTTC CATCTACTOA TCTTAACTCC ATCTTAACAG 

(2) INFORMATION FOR SEQ ID NO:23:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens






(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

CTAAATAAAT TATTTTATCA TATTTTTTAA AATTATTTAA . 

(2) INFORMATION FOR SEQ ID NO:24:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 64 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
CATOATCTTA TCTCTATTTA CCTATACTCT AAATTATACC ATCTATAATC TCCTTAATTT 60
TTAG 64

(2) INFORMATION FOR SEQ ID NO:25:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 52 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:
CTAACAOAAC ATTACAAACC CTCCTCACTA ATCCCATCAC TACITTCCTA AC 52 
(2) INFORMATION FOR SEQ ID NO:26:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

CCXTXTTXXX CTOCTAXm TCTTTCtAAA CTCAYTTOCC CCACAC 46 

(2) INFORMATION FOR SEQ ID NO:27:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

CTATCTTCTC TATACTCTAC ATCCTACTCC ATCTTTCAAA 40 

(2) INFORMATION FOR SEQ ID NO:28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 66 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:
CATCATTCCT CTTCAAATAA CAAACCATTA TGCTTTATCT TCATTTTATT TTTCAC »6 
(2) INFORMATION FOR SEQ ID NO:29:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
GTAAOACAAA AATC7TTTTT RATCACATAG ACAATTACI6 CTC 
(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 40 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:
TTACXWrT CTCTTTTTCC tCTTCCCCTT TTTAAATTAO 
(2) INFORMATION FOR SEQ ID NO:31:
(i) SEQUENCE CHARACTERISTICS:
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:

ctatctmw ataacatota tttctiaaca tascicaggt atoa 

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
11 (AT LENCIHt S4 bts« pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
TTTAATCATC ctciattctc tatttaattt acac

(2) INFORMATION FOR SEQ ID NO:33:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:

GTACTATTTA OAATTTCACC TCT7TTTCTT TTTTCTCTTT TTCTT ^ 
CTCTC 

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
GCAACTACTA TCAT7TTATG TATAAATTAA TCTAAAA7TG A7TAATTTCC AC " 
(2) INFORMATION FOR SEQ ID NO:35:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
CTACCTTXCA AAACATTTA6 7ACTATAA7A TCAATTTCAT OT. 
(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
CCAACTCKAA tTAWWACC CATATTCA6A AACTTACTAC 40 

(2) INFORMATION FOR SEQ ID NO:37:

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
CTXTRTRTRO ACTTTTATAT TACTmRAA CTRCRORRTT CATACTCTCA AAAA S4 
(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 41 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
XTTOTCACCT TRATTTT6TC RTCTCTICAT TTlTRmCA C 

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:
TCCCCCCCTC CC CC TC TC 
(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10 baaa paiFi 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

CCACCCCCCC CTCCC G T O 

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

CTGAACCCCT CTCAXCCTCC 

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

XCGTCCOOGC ACCAATGGA 

(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:

ATCXTXTCTT XCCXXXTCXT ATAC 2# 

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
TTXTTCCTXC TTCTTCtXTX CA6 23 
(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

_ 21 
TXCCCXTGCT CGCTCTTTTT C 
(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
|A> UNCTKt XO b«M P*lf 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

TCCCCCCATC rrcTTccrcx 

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

ACATTAGGCA CAAACCTTGC AA 

(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

ATCAACCTCC AGTAAOAAGG TA 

(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH; XO fesst ptlrfl 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:
TCCCCCTCCT CCCTTCTTC 
(2) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:
OCCCCTTCCt TTCTCACCAC 

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
TTPTCTCCTO CCTCTTACXC C 
(2) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
ATGACACCCC CCATTCCCTC 20
(2) INFORMATION FOR SEQ ID NO:53:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

ccacttaaao cacatatatt tact 24 

(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

CTATCGAAAA TACTGAAGAA CC 22 

(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
TTCTTAACTC CT CTTTTTCT TTTG 
(2) INFORMATION FOR SEQ ID NO:56:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:

TTIAGAACCT TTTTTGTCTT CTC 23 

(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:

CTCAOATTAT ACACTAAGCC TAAC 24 

(2) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
CATGTCTCTT ACAGTACTAC CX 22 
(2) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:

ACCTCCAACC CTAGCGAAG6 20 

(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
TAAAAATOOA TAAACTAGAA TTAAAAC 27 
(2) INFORMATION FOR SEQ ID NO:61:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:
AAATACACAA TCATGICTTG AACT 24 
(2) INFORMATION FOR SEQ ID NO:62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:
ACXCCTAAAC ATCACAATTT CAC 
(2) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:
TAACTTACAt AGCACTAATT TCCC 
(2) INFORMATION FOR SEQ ID NO:64:
(i) SEQUENCE CHARACTERISTICS:

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:

ACAATAAACT GCACTACACA A6C 

(2) INFORMATION FOR SEQ ID NO:65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:
MACCTCATT CCTTCTTCC? CAT 
(2) INFORMATION FOR SEQ ID NO:66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

TGAATTTTAA TCGATTXCCT ACWT 24 

(2) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:

CTTTTTTTGC TTTTACTGAT TAACC 3S 

(2) INFORMATION FOR SEQ ID NO:68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

TCTAATTCAT TTTATTCCTA ATACCTC 27 

(2) INFORMATION FOR SEQ ID NO:69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

CGTACCCATA CTATCATTAT TTCT 

(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

CTACCTATTT TTATACCCAC AAAC 24

(2) INFORMATION FOR SEQ ID NO:71:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:

AAGAAACCCT ACACCAIItT TCC 

(2) INFORMATION FOR SEQ ID NO:72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

23 

CATCATTCTT ACAACCATCT ICC 

(2) INFORMATION FOR SEQ ID NO:73:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

ACCTATACTC TAAATTATAC CATC 24 

(2) INFORMATION FOR SEQ ID NO:74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:
CTCATCCCAT TACTCACCAC 20 
(2) INFORMATION FOR SEQ ID NO:75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:
ACTCCTAATT TTCTTTCTAA ACTC 24 
(2) INFORMATION FOR SEQ ID NO:76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:

TGAAGGACTC GCATTTCACC C 21 

(2) INFORMATION FOR SEQ ID NO:77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

TCATTCACTC ACACCCTGAT GAC 

(2) INFORMATION FOR SEQ ID NO:78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

GCTTTGAAAC ATCCACTACO AT 22 

(2) INFORMATION FOR SEQ ID NO:79:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:

AAACATCATT CCTCTTCAAA TAAC 

(2) INFORMATION FOR SEQ ID NO:80:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:

TACCATGAT? TAAAAATCCA CCAC 

(2) INFORMATION FOR SEQ ID NO:81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:
GATGATTGTC TTTTTCCTCT TGC 
(2) INFORMATION FOR SEQ ID NO:82:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:
CTOACCTATC TTAACAAATA CATC 
(2) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

TTTTAAATOA TCCTCIATTC TGTAT 

(2) INFORMATION FOR SEQ ID NO:84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:

ACAGACTCAS ACCCT CCC T C AAAG 

(2) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:
TTTCTATTCT TACTCCTA5C ATT 
(2) INFORMATION FOR SEQ ID NO:86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:
ATACACAGCT AA6AAATTAC OA 22 
(2) INFORMATION FOR SEQ ID NO:87:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
TAOATCACCC ATATTCTCTT TC 22 
(2) INFORMATION FOR SEQ ID NO:88:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:
CAATTAGGTC TTTTTGAGAG TA 22 
(2) INFORMATION FOR SEQ ID NO:89:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:

C7TACTGCAT ACACATTCTC AC 

(2) INFORMATION FOR SEQ ID NO:90:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:

OC TTTTTC TT TCCTAACATO AAO 

(2) INFORMATION FOR SEQ ID NO:91:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:

TCTCCCACAC CTAATACTCC C 

(2) INFORMATION FOR SEQ ID NO:92:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

CCTACAACTG AATCGCCTAC 0 21 

(2) INFORMATION FOR SEQ ID NO:93:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

CACCACAAAA TAATCCTOTC CC 22 

(2) INFORMATION FOR SEQ ID NO:94:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:
ATTTTCTTA C TTTCATT C IT CCTC 
CLAIMS

1. A method of diagnosing or prognosing a neoplastic tissue
of a human, comprising:

detecting somatic alteration of wild-type APC gene coding
sequences or their expression products in a tumor tissue isolated
from a human, said alteration indicating neoplasia of the tissue.

2. The method of claim 1 wherein the expression products 
are mRNA molecules. 

3. The method of claim 2 wherein the alteration of 
wild-type APC mRNA is detected by hybridization of mRNA from said
tissue to an APC gene probe. 

4. The method of claim 1 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in 
electrophoretic mobility of single-stranded DNA on non-denaturing
polyacrylamide gels. 

5. The method of claim 1 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue. 

6. The method of claim 5 further comprising:

subjecting genomic DNA isolated from a non-neoplastic
tissue of the human to Southern hybridization with the APC gene
coding sequence probe; and

comparing the hybridizations of the APC gene probe to 
said tumor and non-neoplastic tissues. 

7. The method of claim 5 wherein the APC gene probe 
detects a restriction fragment length polymorphism.

8. The method of claim 1 wherein the alteration of
wild-type APC gene coding sequences is detected by determining the
sequence of all or part of an APC gene in said tissue using a polymerase
chain reaction, deviations in the APC sequence determined from that
of the sequence shown in Figure 7 (SEQ ID NO:1) suggesting neoplasia.

9. The method of claim 1 wherein the alteration of wild-type
APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said
tissue and (2) a nucleic acid probe complementary to the human
wild-type APC gene coding sequence, when molecules (1) and (2) are
hybridized to each other to form a duplex.

10. The method of claim 5 wherein the APC gene probe
hybridizes to an exon selected from the group consisting of: (1)
nucleotides 822 to 930; and (2) nucleotides 931 to 1309; (3) nucleotides
1406 to 1545; and (4) nucleotides 1956 to 2256.

11. The method of claim 1 wherein the alteration of wild- 
type APC gene coding sequences is detected by amplification of APC 
gene sequences in said tissue and hybridization of the amplified APC 
sequences to nucleic acid probes which comprise APC sequences. 

12. The method of claim 1 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

13. The method of claim 1 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
a deletion mutation.

14. The method of claim 1 wherein the detection of alteration
of wild-type APC gene coding sequences comprises screening for
a point mutation.

15. The method of claim 1 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation. 

16. The method of claim 1 wherein the tumor tissue is a 
colorectal tissue. 

17. The method of claim 6 wherein the nonneoplastic tissue 
isolated from a human is from colonic mucosa. 

18. The method of claim 1 wherein the expression products
are protein molecules. 

19. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunoblotting.

20. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 
21. The method of claim 18 wherein the alteration of
wild-type APC protein is detected by assaying for binding interactions
between APC protein of said tumor tissue and a second cellular protein. 

22. The method of claim 21 wherein the second cellular pro- 
tein is selected from the group consisting of MCC protein, wild-type 
APC protein, and a G protein. 

23. The method of claim 18 wherein the alteration of 
wild-type APC protein is detected by assaying for phospholipid 
metabolites. 

24. A method of supplying wild-type APC gene function to a
cell which has lost said function by virtue of a mutation in an APC
gene, comprising:

introducing a wild-type APC gene into a cell which has
lost said gene function such that said wild-type APC gene is expressed
in the cell.

25. The method of claim 24 wherein the wild-type APC gene 
introduced recombines with the endogenous mutant APC gene present 
in the cell by a double recombination event to correct the APC gene
mutation. 

26. A method of supplying wild-type APC gene function to a
cell which has altered APC function by virtue of a mutation in an APC
gene, comprising:

introducing a portion of a wild-type APC gene into a cell
which has lost said gene function such that said portion is expressed in
the cell, said portion encoding a part of the APC protein which is
required for non-neoplastic growth of said cell.

27. A method of supplying wild-type APC gene function to a
cell which has altered APC function by virtue of a mutation in an APC 
gene, comprising: 

applying human wild-type APC protein to a cell which has 
lost wild-type APC function. 

28. A method of supplying wild-type APC gene function to a 
cell which has altered APC gene function by virtue of a mutation in an 
APC gene, comprising: 
introducing into the cell a molecule which mimics the
function of wild-type APC protein. 

29. A pair of single stranded DNA primers for determination 
of a nucleotide sequence of an APC gene by polymerase chain reaction, 
the sequence of said primers being derived from chromosome 5q band
21, wherein the use of said primers in a polymerase chain reaction 
results in synthesis of DNA having all or part of the sequence shown in 
Figure 7. 

30. The primers of claim 29 which have restriction enzyme
sites at each 5' end.

31. The pair of primers of claim 29 having sequences corresponding
to APC introns.

32. A nucleic acid probe complementary to human wild-type 
APC gene coding sequences.

33. The nucleic acid probe of claim 31 which hybridizes to an 
exon selected from the group consisting of: (1) nucleotides 822 to 930;
and (2) nucleotides 931 to 1309; (3) nucleotides 1406 to 1545; (4)
nucleotides 1966 to 2256. 

34. A kit for detecting alteration of wild-type APC genes 
comprising a battery of nucleic acid probes which in the aggregate
hybridize to all nucleotides of the APC gene coding sequences.

35. A method of detecting the presence of a neoplastic tissue 
in a human, comprising: 

detecting in a body sample isolated from a human alteration
of a wild-type APC gene coding sequence or wild-type APC
expression product, said alteration indicating the presence of a
neoplastic tissue in the human.

36. The method of claim 35 wherein said body sample is 
selected from the group consisting of serum, stool, urine and sputum. 

37. A method of detecting genetic predisposition to cancer, 
including familial adenomatous polyposis (FAP) and Gardner's Syndrome 
(GS), in a human comprising: 

detecting a germline alteration of wild-type APC gene
coding sequences or their expression products in a human sample
selected from the group consisting of blood and fetal tissue, said alter-
ation indicating predisposition to cancer.

38. The method of claim 37 wherein the expression products 
are mRNA molecules. 

39. The method of claim 38 wherein the alteration of
wild-type APC mRNA is detected by hybridization of mRNA from said 
tissue to an APC gene probe. 

40. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by observing shifts in
electrophoretic mobility of single-stranded DNA on non-denaturing 
polyacrylamide gels. 

41. The method of claim 37 wherein alteration of wild-type 
APC gene coding sequences is detected by hybridization of an APC 
gene coding sequence probe to genomic DNA isolated from said tissue.

42. The method of claim 41 wherein the APC gene coding
sequence probe detects a restriction fragment length polymorphism. 

43. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by determining the 
sequence of all or part of an APC gene in said tissue using a polymerase 
chain reaction, deviations in the APC sequence determined from the 
sequence of Figure 7 suggesting predisposition to cancer.

44. The method of claim 37 wherein the alteration of wild-
type APC gene coding sequences is detected by identifying a mismatch
between molecules (1) an APC gene or APC mRNA isolated from said 
tissue and (2) a nucleic acid probe complementary to the human wild- 
type APC gene coding sequence, when molecules (1) and (2) are hybrid- 
ized to each other to form a duplex. 

45. The method of claim 41 wherein the APC gene probe 
hybridizes to an exon selected from the group consisting of: 
(1) nucleotides 822 to 930; (2) nucleotides 931 to 1309; (3)
nucleotides 1406 to 1545; and (4) nucleotides 1956 to 2256.

46. The method of claim 37 wherein the alteration of wild-
type APC gene coding sequences is detected by amplification of APC
gene sequences in said tissue and hybridization of the amplified APC
sequences to nucleic acid probes which comprise APC gene coding
sequences.

47. The method of claim 37 wherein the alteration of 
wild-type APC gene coding sequences is detected by molecular cloning 
of the APC genes in said tissue and sequencing all or part of the cloned 
APC gene. 

48. The method of claim 37 wherein the detection of alter-
ation of wild-type APC gene coding sequences comprises screening for
a deletion mutation. 

49. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
a point mutation. 

50. The method of claim 37 wherein the detection of alter- 
ation of wild-type APC gene coding sequences comprises screening for 
an insertion mutation.

51. The method of claim 37 wherein the expression products
are protein molecules. 

52. The method of claim 51 wherein the alteration of
wild-type APC protein is detected by immunoblotting.

53. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by immunocytochemistry. 

54. The method of claim 51 wherein the alteration of 
wild-type APC protein is detected by assaying for binding interactions 
between APC protein isolated from said tissue and a second cellular 
protein. 

55. The method of claim 54 wherein the second cellular pro-
tein is selected from the group consisting of MCC protein, wild-type
APC protein and a G protein. 

56. A method of screening for genetic predisposition to can-
cer, including familial adenomatous polyposis (FAP) and Gardner's Syn-
drome (GS), in a human comprising:

detecting among kindred persons the presence of a DNA
polymorphism which is linked to a mutant APC allele in an individual
having a genetic predisposition to cancer, said kindred being
genetically related to the individual, the presence of said polymorphism
suggesting a predisposition to cancer.

57. A preparation of the human APC protein substantially 
free of other human proteins, the amino acid sequence of said protein 
corresponding to that shown in Figure 3 or 7 (SEQ ID NO: 1).

58. A preparation of antibodies immunoreactive with a
human APC protein and not substantially immunoreactive with other 
human proteins. 

59. A method of testing therapeutic agents for the ability to
suppress a neoplastically transformed phenotype, comprising:

applying a test substance to a cultured epithelial cell 
which carries a mutation in an APC allele; 

determining whether said test substance suppresses the 
neoplastically transformed phenotype of the cell. 

60. The method of claim 59 wherein the cultured epithelial 
cell has been genetically engineered to carry the mutation in the APC 
allele. 

61. A method of testing therapeutic agents for the ability to 
suppress neoplastic growth, comprising: 

administering a test substance to an animal which carries 
a mutant APC allele in its genome; 

determining whether said test substance prevents or sup- 
presses the growth of tumors. 

62. A transgenic animal which carries a mutant APC allele
from a second animal species in its genome. 

63. An animal which has been genetically engineered to con-
tain an insertion mutation which disrupts an APC allele in its genome.

64. A cDNA molecule which encodes a protein having the 
amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1). 

65. An isolated DNA molecule which encodes a protein having 
the amino acid sequence shown in Figure 3 or 7 (SEQ ID NO: 7 or 1).

66. A yeast artificial chromosome which is known as 37HC4. 
TABLE IIA

Germline Mutations of the APC Gene in FAP and GS Patients

                        NUCLEOTIDE               AMINO ACID          EXTRA-COLONIC
PATIENT       CODON     CHANGE*                  CHANGE              DISEASE

[illegible]   219       TCA->T[illegible]A       Ser->Stop
[illegible]   301       CGA->TGA                 Arg->Stop
[illegible]   [illegible]  CGA->[illegible]                          [illegible]
21            [illegible]  CGC->TGC              Arg->Cys            [illegible]
10            712                                [illegible]->Stop   [illegible]
[illegible]   243       [illegible]              splice-junction
[illegible]   301       CGA->T[illegible]
2127          455       C[illegible]CA->CTTCA    frameshift
3712          500       [illegible]              Tyr->Stop
24                                                                   [illegible]

* The mutated nucleotides are underlined.
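
By way of illustration only, a codon change of the kind tabulated above can be checked against the standard genetic code. The following minimal Python sketch is not part of the disclosed methods; the function name describe_change and the example codons passed to it are assumptions made for illustration.

```python
# Illustrative sketch only: translate a wild-type and a mutant codon with the
# standard genetic code and report the resulting amino acid change.

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering ("*" marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def describe_change(wild_type: str, mutant: str) -> str:
    """Return a label such as 'Arg->Stop' for two three-nucleotide DNA codons."""
    wt, mut = wild_type.upper(), mutant.upper()
    for codon in (wt, mut):
        if codon not in CODON_TABLE:
            raise ValueError(f"not a DNA codon: {codon!r}")
    return f"{THREE_LETTER[CODON_TABLE[wt]]}->{THREE_LETTER[CODON_TABLE[mut]]}"


if __name__ == "__main__":
    # Hypothetical usage: a C-to-T change in the first position of CGA.
    print(describe_change("CGA", "TGA"))
```

Under the standard code, CGA encodes arginine and TGA is a stop codon, so the call shown reports an Arg->Stop change. Splice-junction and frameshift entries are outside the scope of this single-codon check.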
TABLE IIB

Somatic Mutations in Sporadic CRC Patients

[illegible]   MCC [illegible]     [illegible]                          (Splice Donor)
[illegible]   MCC 145             [illegible]                          (Splice Acceptor)
T47           MCC [illegible]     CGG->CTG                             Arg->Leu
[illegible]   MCC 490             TCG->TTG                             Ser->Leu
T35           MCC [illegible]     CGG->CAG                             Arg->Gln
T9[illegible] MCC [illegible]91   GCT->GTT                             Ala->Val
T34           APC 211             [illegible]                          (Insertion)
[illegible]   APC [illegible]     CGA->TGA                             Arg->Stop
[illegible]   APC [illegible]     CAA/[illegible]->CAA/[illegible]     (Splice Donor)
[illegible]   APC [illegible]     CAG->TAG                             Gln->Stop

For splice site mutations, the codon nearest to the mutation is listed.

The underlined nucleotides were mutated; small case letters represent introns, large case letters represent exons.
TABLE III

Sequences of Primers Used for SSCP Analysis

[table contents illegible]
TABLE IV

Seven Different Versions of the 20-Amino Acid Repeat
Coamawc r VE* TP' CF$A' JSUSLS 

1Mb YCV60TM CFSACSSISSLS 

1378; HTVQETPlMFSRCTSVSStO 

U9t FATES TP OQPSCSSSISAIS 

1S43; YCVEfiTPI NfSTATSLSOLT 

ISO: T P I EGTPYCFSRNOSLSSlD 

1153: f At ENTPVCPSHNSSISUS 

20*3: fMV£OTPVCFS«N$$tS$LS 

Numbers denote the first amino acid of each repeat. The consensus
sequence at the top reflects a majority amino acid at a given position.
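
For illustration, a consensus that takes the most frequent residue at each position of a set of aligned repeats can be sketched in Python as follows. The function name majority_consensus, the "." placeholder used where no residue has a majority, and the short input strings are hypothetical and are not taken from the table.

```python
# Illustrative sketch only: majority-rule consensus over equal-length,
# already-aligned sequences.
from collections import Counter


def majority_consensus(aligned, min_fraction=0.5, gap="."):
    """Return the consensus of equal-length sequences.

    A residue is reported at a position only if it occurs in more than
    `min_fraction` of the sequences; otherwise `gap` is emitted.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must have the same length")

    consensus = []
    for column in zip(*aligned):
        # Most frequent residue in this column and how often it occurs.
        residue, count = Counter(column).most_common(1)[0]
        consensus.append(residue if count / len(aligned) > min_fraction else gap)
    return "".join(consensus)


if __name__ == "__main__":
    # Hypothetical toy input, not the repeats of the table.
    print(majority_consensus(["ACDE", "ACDF", "GCDE"]))
```

Requiring a strict majority avoids arbitrary choices when two residues are equally frequent at a position; a different threshold could be chosen by changing min_fraction.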
[Figure panel B: labels legible in the source include Contig 1, Contig 2, Contig 3 and YAC.]
A) TB1 Amino Acid Sequence

VAPVYY6SGR APRHPAPAAH HPRRPOGFDG 16YRGGARDE QGF66AFPAR SFST6SDL6H 60
WVTTPP0IP6 SRNLHWGEKS PPYGVPTTST PYEGPTEEPF 5SG6GGSVOG GSSEOLMRFA 120 
6FGIGLASLF TENVLAHPCI VLRRQCQVNY HAQHYHLTPF TVINIHYSFN KTGGPRAUOC 180
GMGSTFIVQG VTIGAEGZXS EFTPIPREVI HKVSPKQI6E HU.UC51YYV VAMPFYSASl 240 
ZETVQSEIIR ONTGXtECVK EGZGRVZGNG VPHSKRLLPl LSLIFPTVLH GVLHYZISSV 300
IOKFVILZLK RCTYKSHUE STSPVQSMLO AYFPELIANF AASLC SDVIL YPLETVLHRl 360 
fljflfiflTTTft MTPieVfVLt THTOVffiHKD CIMTIMEES VPCFYTCFGA VIIOTTLHAA 420 
VLQITKZZYS TUO . 434 

B) TB2 Amino Acid Sequence

ELRRFORFLN EKNCNTOLLA KLEAICTGVNR SFZALGVZGL VALYLVFGYG ASLLCNLZGF 60 

6YPAYZSZKA ZESPNKEDOT QM.TYWYYG YFSZAEFFSO ZFLSHFPFYY ZUCCGFU.WC 120 

KAPSPSNGAE LLYKRZXRPF FLXHESQKOS WCTUCDUX ETABAITKEA KKATVNLLGE 180 

EKXST 185
APC AMINO ACID SEQUENCE

HAAASYOQLl KQVEAIKHEN SNLRQELEOM SNHITKLETE ASNNKEVLKQ IQGSIEDEAM 60
ASSGQZOLLE RUCEINIOSS NFPCVKLRSK HSLRSYGSRE 6SVSSRS6EC SPVPMGSFPR 120 
RGFVNGSRES TGYLEELEKE RSLLLAOLOK EEKEKOVYYA QLQHLTKRIO SLLYENFSLQ 180
TOKYRRQLEY EARQIRVAME EOLCTCODHE KRAQRRIARI QQXEKOILRI RQLLQSQATE 240 
AERSSQNKHE TGSKOAERQN EGQGVGEXNM ATS6NG0GST TRMOHETASV LSSSSTHSAP 300 
RRLTSHLGTK VEHVYSLLSK LGTHDKDOMS RTLUNSSSQ OSCISNRQSG CLPLLIQLLH 360 
GMOKDSVLLG NSRGSKEARA RASAALMNII KSQPOOKRGR RHRVLHLLE QIRAYCETCV 420 
EWQEAHEP6X OQOKKPMPAf VEKQ2CPAVC VLMX5F0EE NRHANKELGG LQAIAELLOV 480 
DCEHYGITNO NYSITLRRYA GKALTNLTFG OVANKATLCS MK6CMRALVA QUSESEOLQ 540 
QVZASVLRNL SWRACYNSWC TLREVGSVKA LHECALEYKK ESTUCSVLSA LWNLSAHCYE 600 
NKAOICAVDG ALAFLVGTLT YRSQTNTUI IESGGGILRN VSSUATNED KRQILREKNC 660 
LQTLLQHUCS HSLTIVSNAC GTIUNISARK PKDQEALUDM 6AVJHUCHLI HSKHOttAHG 720 
SAAALRNLKA NRPAJCYKDAN ZNSPGSSLPS LHVRKQKALE AELOAQHLSE TFONXOKLSP 780 
KASHRSKQRK KQSLYGDYVF DTNRHDOMRS ONFMTGNKTV LSPYLXTTVL PSSSSSRGSL 840 
DSSRSEJCDRS LERER6IGL6 KYHPATENPG TSSttGUIS TTAAQIAXVN EEVSAXHTSQ 900 
EORSSGSfTE LHCVTOERHA LRRSSAAHTW SNTYNFTKSE NSNRTCSKPY AKLEYKRSSN 960 
OSLMSVSSSD GYGKRGQMKP SZESYSEOOE SKFC5YGQYP AOLAHKXKSA NHHOONOGEL 1020 
OTPXMYSLKY SOEQLNSGRQ SPSQNERUAR PKHIIE0E2K QSEQRQSRNQ STTYPVYTE5 1080 
TODKHUCFQP HFGQQECVSP YRSRGANGSE TNRVGSNHGZ NQNVSQSLCQ EOOYEOOKPT 1140 
NYSERYSEEE QHEEEERPTN YSHCYNEEJCR HVDQPXOYSL KYATDIPSSO KGSFSFSKSS 1200 
SGQSSICTEHH SSSSENTSTP SSNAJCRQNQL HPSSAOSRSG QPOXAATOV SSXNQETIQT 1260
YCVEOTPXCF SRCSSLSSLS SAEOEXGCNQ TTQOPOSANT LQIAEXKEKX 6TRSAE0PVS 1320 
EVPAVSQKPR TKSSRLOGSS LSSESARNKA VEFSSGAKSP SKS6AQTPC PPEHYVQETP 1380 
UtFSRCTSVS SL0SFESRS2 ASSVQSEPCS 6MVS6XXSPS DLPOSPGQ7K PPSRSKTPPP 1440 
PPQTAQTKRE VPKKKAPTAE KRES6PK0AA VNAAVQRVQV LPOAOYUKF AYESTPOGFS 1500 
CSSSLSAUL DEPFIQXDVE LRXNPPVQEK ONGNETESEQ PKESMENQEK EAEJCTXOSEK 1560
DUADSOODO XEXLEECXXS ANPTXSSRKA KXPAQTASKL PPPVARJCPSO LPVYKLLPSQ 1620 
NRLOPQJCMVS FTP600HPRV YCVE6TPINF STATSLSDLT XESPPNELAA GEGVR66AQS 1680 
GEFEXROTIP TEGRSTOEAQ GOCTSSVTXP ELDDKJCAEE6 DXLAECXNSA MPKGWHKPF 1740 
RVKKXMOQVQ QASASSUPN KNQLDGOOQC PTSPVKPXPQ NTEYRTRVRK KAOSXNNLNA 1800 
ERVFSONKDS KKQNUQOfSlC OFNOKLPNNE ORVRGSFAFD SPKHYYPXEG TPVCF SRNOS 1860
LSSLOFDOOO VDLSREKAEL RKAJCEMCESE AJCYT5HTELT SNOQSAMCYQ AXAKQPINR6 1920 
QPKPXLOKQS TFPQSSKDIP ORGAATDEKL QNFAXENTPV CFSHKSSUS LSOXOQENNN 1980 
ICENEPXKETE PPOSQGEPSK PQAS6YAPKS FHVEDTPVCF SRNSSLSSLS XOSEOOaQE 2040 
aSSAKPUX KPSRUCGONE KHSPRNHGGX LGEOLTLOUC OXQRPQSEH6 LSPOSENFOtf 2100 
KAXQEGANSX VSSIHQAAAA ACISRQASSD IttSXLSLKSG XSLGSPFNLT POQEEKPFTS 2160 
KK6PRXLKPG EKSTLETXXI ESESKGXKGG ttVYWUTG KVRSNSEISG QMKQPIQANM 2220 
PSXSRGRTNX HXPGVRNSSS STSPVSKXGP PUCTPASKSP SEGQTATTSP RGAXPSVKSE 2280
LSPVAROT5Q X6GSSKAPSR SGSROSTPSR PAQOPL5RPX QSPGRKSXSP 6RNGXSPPNK 2340 
LSQLPRTSSP STASTKSS6S GKHSYTSPGR QMSMNITKQ TGLSKNASSX PRSESASKGL 2400 
NQHNNGNGAK KKVELSRNSS TKSSGSESDR SERPVLVRQS YFIKEAPSPT LRRKLEESAS 2460 